SECTION I


MODEL, PURPOSE, AND USES


INTRODUCTION 


This section introduces covariance analysis by describing the composition
of the model, the purpose of the technique, the situations in which it is
applicable, and the ways it may be used. Derivations are left to later
sections; only general statements of the results developed elsewhere in the
report are given here.

In  Section  II  the  theory  for  covariance  analysis  in  the  univariate 
case  with  a single  covariate  is  developed.  Uses,  such  as  adjusting 
treatment  means,  increasing  the  precision  in  randomized  experiments,  and 
obtaining  insight  into  the  nature  of  treatment  effects,  are  explained. 

An  example  using  the  analysis  of  covariance  in  a completely  randomized 
design  with  balanced  data  is  given. 

The theory for applying covariance analysis to a nonparametric
situation is presented in Section III. Only one rank method is presented,
but others are indicated. The data used in the example are real.

MODEL  COMPOSITION 


The covariance model consists of classification type variables, as
found in an analysis of variance model, and a continuous type variable,
as is usually found in regression models. Letting $y_{ij}$ denote the jth
observation in the ith class, then in a covariance model, the response
$y_{ij}$ would be the result of a combination of features from the above
conditions. For example, in a one-way classification with one covariate

    $y_{ij} = \mu_i + \beta (z_{ij} - \bar{z}_{..}) + e_{ij}$                (1)

where

    $\mu_i$ represents the population mean of the ith class when $z_{ij}$ equals $\bar{z}_{..}$,
    $\beta$ is a regression coefficient of y on z,
    $z_{ij}$ is the covariate associated with the ijth observation,
    $\bar{z}_{..}$ is the overall mean of the covariates, and
    $e_{ij}$ is the residual.

PURPOSE


Covariance  analysis  is  primarily  used  in  situations  where  one  is 
interested  in  a response  (dependent  variable)  which  is  influenced  by  one 
or  more  covariates  which  cannot  be  or  have  not  been  controlled  by  a 
randomization  scheme.  There  may  also  be  cases  where  the  covariates  have 
been  controlled.  The  covariates  usually  reflect  some  characteristic 
which  is  related  to,  or  influences,  the  response.  This  influence  may 
affect  the  response  directly  or  indirectly  but  does  not  necessarily 
have  to  produce  a cause  and  effect  situation.  For  example,  in  agronomy 
one  may  use  the  yield  of  grain  per  acre  as  a response  and  the  number  of 
plants  per  acre  as  the  covariate.  The  covariate  is  also  known  as  the 
independent  variable  or  the  concomitant  variable. 

PRINCIPAL USES

Covariance  analysis  has  a variety  of  uses  and  its  application  will 
depend  upon  the  investigator's  objective. 

(1)  To  adjust  treatment  means 

Suppose  the  response  contains  contributions  from  the 
treatment  effects,  the  covariate,  and  the  error.  To  correct  or  adjust 
for  the  covariate,  a quantity  equal  to  the  product  of  the  estimated  slope 
times  the  deviation  of  the  mean  of  the  covariate  for  a given  treatment 
from  the  overall  average  of  the  covariate  is  subtracted  from  the  average 
response  of  the  treatment;  i.e., 


    $\mathrm{adj}\ \bar{y}_{i.} = \bar{y}_{i.} - \hat{\beta} (\bar{z}_{i.} - \bar{z}_{..})$.


(2)  To  increase  precision  in  randomized  experiments 


Covariance analysis converts the variance of the responses,
$\sigma_y^2$, to the variance about regression, $\sigma_{y \cdot z}^2$. If
$\sigma_{y \cdot z}^2 < \sigma_y^2$, then covariance analysis is considered to
have increased the precision and is an improvement over the analysis when
covariance is not used. As long as the covariance model is linear, the
covariance technique will result in the variance of a treatment mean,
$V(\hat{\mu}_i)$, being changed from $\sigma_y^2 / n$ to

    $\sigma_{y \cdot z}^2 \left[ \frac{1}{n} + \frac{(\bar{z}_{i.} - \bar{z}_{..})^2}{\sum_{ij} (z_{ij} - \bar{z}_{i.})^2} \right]$

for the univariate case.

(3)  To remove the bias in observational studies


A researcher conducting a survey may be limited to a small number of
observations in a few locations, and these observations may not be
randomized. Snedecor and Cochran (12) point out that these conditions
constitute an observational study. Suppose a researcher wished to study
the relationship between obesity in workers, by occupation, and their
physical activity. Since obesity may not be found in every worker, the
researcher would have to take observations wherever a subject can be
found, and therefore cannot predetermine a sampling scheme. Also, the
response, obesity, would probably be measured as weight, a ratio scale
measure, but the covariable, physical activity, would be measured on an
ordinal scale. This may lead to problems in adjusting the means and in
making inferences. If another characteristic, such as age, is chosen as a
substitute for physical activity, then a more sensitive comparison of
obesity in workers may be made, since age is measured on an interval scale.

(4)  To  provide  additional  information  on  the  nature  of 
treatment  effects 


Bancroft  (1)  points  out  that  if  treatment  differences 
disappear  after  adjusting  for  the  concomitant  variable,  then  this  may 
suggest  that  the  unadjusted  treatment  differences  are  simply  a reflection 
of  the  treatment  effects  on  the  concomitant  variable.  For  this  reason, 
treatments  should  not  affect  the  concomitant  variable. 

(5)  To  analyze  data  when  some  observations  are  missing 

Covariance analysis may be used as an alternative technique for
analyzing data when some responses in an analysis of variance design are
missing. The computations of the covariance technique are more involved
than those of other missing data methods, but as Cochran (3) and Steel and
Torrie (8) indicate, the technique yields unbiased sums of squares for
estimating all classification effects. The technique also provides exact
F-tests on the classification effects.


SECTION II


COVARIANCE  ANALYSIS  MODEL 


INTRODUCTION 

In this section, the theory will be developed for handling the
covariance  analysis  model.  The  model  coefficients,  slopes,  and  means 
will  be  investigated,  and  a test  statistic  will  be  developed  for  testing 
hypotheses  about  these  parameters.  The  assumptions  underlying  the  model 
will  be  presented.  Three  of  the  principal  uses  (adjusting  means, 
increasing  precision,  and  obtaining  information  on  treatment  effects) 
will  be  discussed. 

A method  for  determining  if  the  analysis  of  covariance  procedure 
offers  advantages  over  the  analysis  when  covariance  is  not  used  will  be 
discussed. 

MODEL 

The model, as introduced in Section I, consists of t classes or
treatments and $n_i$ observations within each treatment. Then i = 1, 2, ...,
t and j = 1, 2, ..., $n_i$, where we assume $n_i \ge 2$, and for at least one
treatment, $n_i \ge 3$. The way the model is subscripted indicates that each
treatment may be estimated by its own regression line of y on z. Therefore
Equation (1) may be expressed as

    $y_{ij} = \mu_i + \beta_i (z_{ij} - \bar{z}_{..}) + e_{ij}$                (1)

until it can be shown that one slope is common to all t regression lines.
We will assume the error term to have the following properties:

    $E(e_{ij}) = 0$ for all i, j,

and $E(e_{ij} e_{i'j'}) = \sigma^2$ when i = i' and j = j'

                          = 0 otherwise.


By letting $\tau_i = \mu_i - \beta_i \bar{z}_{..}$, one will obtain an easier
model with which to work:

    $y_{ij} = \tau_i + \beta_i z_{ij} + e_{ij}$                (2)
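The step from Equation (1) to Equation (2) is only a regrouping of terms. Written out, with $\tau_i$ absorbing the constant part of the covariate adjustment, it reads:

```latex
\begin{aligned}
y_{ij} &= \mu_i + \beta_i (z_{ij} - \bar{z}_{..}) + e_{ij} \\
       &= (\mu_i - \beta_i \bar{z}_{..}) + \beta_i z_{ij} + e_{ij} \\
       &= \tau_i + \beta_i z_{ij} + e_{ij},
       \qquad \tau_i = \mu_i - \beta_i \bar{z}_{..}.
\end{aligned}
```

In this form $\tau_i$ is the intercept of the ith regression line, whereas $\mu_i$ is the height of that line at the overall covariate mean.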


Assumptions  for  Analysis  of  Covariance 


Cochran  (3)  lists  two  assumptions  necessary  to  make  covariance 
analysis  valid: 

(1)  The design effect (blocks, treatments, etc.) and regression
effect are additive.  If for some reason they are not, one may still
improve the precision, but

(a)  The meaning of the adjusted treatment means may
become questionable, and

(b)  The true difference of treatment means will not
be obtained.

(2)  The residuals $e_{ij}$ are independent and normally distributed
with zero means and equal variance.  The normality assumption permits
probability statements to be made about the statistics.

(3)  Steel and Torrie (8) include one additional assumption:
the covariates are measured without error.

Test  for  a Common  Slope 

Upon  the  completion  of  an  experiment  having  a completely  randomized 
design,  one  may  display  the  test  data  as  shown  in  Table  1. 


TABLE  1.  A RAW  DATA  SHEET  FOR  A COMPLETELY  RANDOMIZED  DESIGN  EXPERIMENT 


It can be seen that by having $n_i \ge 2$, the data from the ith treatment may
be fitted to the model described by Equation (2).


We will now derive a test statistic for testing the following hypothesis:

H0:  all treatment slopes are equal ($\beta_1 = \beta_2 = \cdots = \beta_t$)

H1:  at least one slope is different from the rest.


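Expressing Equation (2) in matrix notation, let

    $y_i$ ($n_i \times 1$) $= [y_{i1}, y_{i2}, \ldots, y_{i n_i}]'$,
    $y$ ($n_. \times 1$) $= [y_1', y_2', \ldots, y_t']'$,
    $j_{n_i}$ ($n_i \times 1$) $= [1, 1, \ldots, 1]'$,
    $z_i$ ($n_i \times 1$) $= [z_{i1}, z_{i2}, \ldots, z_{i n_i}]'$,

    $\Gamma$ ($n_. \times 2t$) $=
    \begin{bmatrix}
    j_{n_1} & 0 & \cdots & 0 & z_1 & 0 & \cdots & 0 \\
    0 & j_{n_2} & \cdots & 0 & 0 & z_2 & \cdots & 0 \\
    \vdots & & \ddots & \vdots & \vdots & & \ddots & \vdots \\
    0 & 0 & \cdots & j_{n_t} & 0 & 0 & \cdots & z_t
    \end{bmatrix}$,

    $\tau$ ($t \times 1$) $= [\tau_1, \ldots, \tau_t]'$,
    $\beta$ ($t \times 1$) $= [\beta_1, \ldots, \beta_t]'$,

with $\eta = [\tau', \beta']'$, $n_. = \sum_i n_i$, and $e_i$ and $e$ partitioned
in the same way as $y_i$ and $y$.

We now have:

    $y = \Gamma \eta + e$

where $E(e) = 0$

and $E(e e') = \sigma^2 I$.

The normal equations are:

    $\Gamma' \Gamma \hat{\eta} = \Gamma' y$

and

    $\Gamma' y$ ($2t \times 1$) $= [n_1 \bar{y}_{1.}, \ldots, n_t \bar{y}_{t.}, \sum_j z_{1j} y_{1j}, \ldots, \sum_j z_{tj} y_{tj}]'$.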


The normal equations for the ith treatment can be expressed as

    $n_i \hat{\tau}_i + n_i \bar{z}_{i.} \hat{\beta}_i = n_i \bar{y}_{i.}$                (3)

    $n_i \bar{z}_{i.} \hat{\tau}_i + \sum_j z_{ij}^2 \hat{\beta}_i = \sum_j z_{ij} y_{ij}$.                (4)

Multiplying Equation (3) by $\bar{z}_{i.}$ and subtracting the result from
Equation (4) yields

    $(\sum_j z_{ij}^2 - n_i \bar{z}_{i.}^2) \hat{\beta}_i = \sum_j z_{ij} y_{ij} - n_i \bar{y}_{i.} \bar{z}_{i.}$.

Notice that $(\sum_j z_{ij}^2 - n_i \bar{z}_{i.}^2)$ is the corrected sum of squares
for the covariate in the ith treatment and that
$\sum_j z_{ij} y_{ij} - n_i \bar{y}_{i.} \bar{z}_{i.}$ is the corrected cross-product
sum of the response and covariate in the ith treatment. So $\hat{\beta}_i$ can
be expressed as

    $\hat{\beta}_i = \frac{\sum_j (z_{ij} - \bar{z}_{i.})(y_{ij} - \bar{y}_{i.})}{\sum_j (z_{ij} - \bar{z}_{i.})^2}$                (5)

and from Equation (3), $\hat{\tau}_i$ is found to be

    $\hat{\tau}_i = \bar{y}_{i.} - \hat{\beta}_i \bar{z}_{i.}$.
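The estimates in Equation (5) and the intercept that follows from Equation (3) can be sketched in a few lines of Python. This is a minimal illustration, not part of the original report: the function names `within_sums` and `treatment_line` are hypothetical, and the data are the small-kit observations from Table 3 of the example later in this section.

```python
def within_sums(z, y):
    """Corrected sums for one treatment: (E_zz, E_yy, E_zy)."""
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    e_zz = sum((a - z_bar) ** 2 for a in z)
    e_yy = sum((b - y_bar) ** 2 for b in y)
    e_zy = sum((a - z_bar) * (b - y_bar) for a, b in zip(z, y))
    return e_zz, e_yy, e_zy


def treatment_line(z, y):
    """Slope from Equation (5) and intercept tau_i = y_bar - slope * z_bar."""
    e_zz, _, e_zy = within_sums(z, y)
    slope = e_zy / e_zz
    intercept = sum(y) / len(y) - slope * sum(z) / len(z)
    return slope, intercept


# Small-kit observations (alloy z in mg, wear y in mm) from Table 3.
z_small = [15, 16, 19, 22, 24, 25, 32]
y_small = [33, 31, 31, 30, 29, 27, 26]

slope_small, intercept_small = treatment_line(z_small, y_small)
print(f"slope = {slope_small:.6f}, intercept = {intercept_small:.4f}")
```

The three corrected sums returned by `within_sums` are the same quantities worked by hand for the small kit in the example (206.857143, 35.714286, and -81.428571).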

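Now calculating the sum of squares associated with the model containing
each treatment mean and slope, one has

    $R(\tau_1, \ldots, \tau_t, \beta_1, \ldots, \beta_t) = \hat{\eta}' \Gamma' y$

        $= \sum_i \hat{\tau}_i n_i \bar{y}_{i.} + \sum_i \hat{\beta}_i \sum_j z_{ij} y_{ij}$

        $= \sum_i \left[ n_i \bar{y}_{i.}^2 + \hat{\beta}_i \sum_j (z_{ij} - \bar{z}_{i.})(y_{ij} - \bar{y}_{i.}) \right]$

        $= \sum_i \left[ n_i \bar{y}_{i.}^2 + \frac{[\sum_j (z_{ij} - \bar{z}_{i.})(y_{ij} - \bar{y}_{i.})]^2}{\sum_j (z_{ij} - \bar{z}_{i.})^2} \right]$.

For descriptive purposes, $R(\tau_1, \ldots, \tau_t, \beta_1, \ldots, \beta_t)$ will be
referred to, in this subsection, as the reduction due to the full model.
Subtracting the reduction due to the full model from the total sum of
squares, one obtains the residual sum of squares for the model, or
Residual (full):

    Residual (full) $= y'y - \hat{\eta}' \Gamma' y$

        $= \sum_i \left[ \sum_j (y_{ij} - \bar{y}_{i.})^2 - \frac{[\sum_j (z_{ij} - \bar{z}_{i.})(y_{ij} - \bar{y}_{i.})]^2}{\sum_j (z_{ij} - \bar{z}_{i.})^2} \right]$.

Express the residual sums of squares and cross products as

    $E_{yy}^{(i)} = \sum_j (y_{ij} - \bar{y}_{i.})^2$,

    $E_{zz}^{(i)} = \sum_j (z_{ij} - \bar{z}_{i.})^2$,

    $E_{zy}^{(i)} = \sum_j (z_{ij} - \bar{z}_{i.})(y_{ij} - \bar{y}_{i.})$,

so that

    Residual (full) $= \sum_i \left[ E_{yy}^{(i)} - (E_{zy}^{(i)})^2 / E_{zz}^{(i)} \right]$,

and this sum of squares has associated with it ($n_. - 2t$) degrees of
freedom since the rank of $\Gamma'\Gamma$ is 2t.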

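The model describing the data may be simplified if a slope common
to all treatments may be assumed. Consider now a reduced model
incorporating a common slope and each treatment mean:

    $y_{ij} = \tau_i + \beta z_{ij} + e_{ij}$.

In matrix notation, let $y_i$, $y$, $e_i$, $\tau$, and $e$ be defined as before;
$\Gamma$ and $\eta$ for this model become:

    $\Gamma$ ($n_. \times (t + 1)$) $=
    \begin{bmatrix}
    j_{n_1} & 0 & \cdots & 0 & z_1 \\
    0 & j_{n_2} & \cdots & 0 & z_2 \\
    \vdots & & \ddots & \vdots & \vdots \\
    0 & 0 & \cdots & j_{n_t} & z_t
    \end{bmatrix}$,
    $\eta$ ($(t + 1) \times 1$) $= [\tau_1, \ldots, \tau_t, \beta]'$.

In the reduced equation, $y = \Gamma \eta + e$, one still assumes that $E(e) = 0$
and $E(e e') = \sigma^2 I$. The normal equations are

    $\Gamma' \Gamma \hat{\eta} = \Gamma' y$.

In solving for $\hat{\beta}$, one may multiply the:

    1st row by $\bar{z}_{1.}$ and subtract from the last row,
    2nd row by $\bar{z}_{2.}$ and subtract from the last row,
    ...
    tth row by $\bar{z}_{t.}$ and subtract from the last row.

This leaves all but the last term in the last row with zeros. The
equation associated with the last row then becomes

    $\left( \sum_{ij} z_{ij}^2 - \sum_i n_i \bar{z}_{i.}^2 \right) \hat{\beta} = \sum_{ij} z_{ij} y_{ij} - \sum_i n_i \bar{z}_{i.} \bar{y}_{i.}$

or

    $\sum_i \left[ \sum_j (z_{ij} - \bar{z}_{i.})^2 \right] \hat{\beta} = \sum_i \left[ \sum_j (z_{ij} - \bar{z}_{i.})(y_{ij} - \bar{y}_{i.}) \right]$.

Using the same notation as before, we get

    $\sum_i E_{zz}^{(i)} \hat{\beta} = \sum_i E_{zy}^{(i)}$.

By defining

    $E_{wv}^{(\cdot)} = \sum_i E_{wv}^{(i)}$,

then

    $E_{zz}^{(\cdot)} \hat{\beta} = E_{zy}^{(\cdot)}$.

Solving for $\hat{\beta}$ one obtains

    $\hat{\beta} = \frac{E_{zy}^{(\cdot)}}{E_{zz}^{(\cdot)}}$                (6)

where $\hat{\beta}$ is an estimate of the common slope. The numerator is the
pooled (summed) sum of cross products in each treatment, and the
denominator is the pooled sum of squares of the $z_{ij}$'s in each treatment.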
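As a numerical companion to Equation (6), the sketch below pools the within-treatment cross products and sums of squares and takes their ratio. It is an illustrative sketch only: the name `pooled_slope` is hypothetical, and the three groups are the small, medium, and large kit observations from Table 3 of the example later in this section.

```python
def pooled_slope(groups):
    """Common slope of Equation (6): pooled E_zy divided by pooled E_zz."""
    pooled_zy = 0.0
    pooled_zz = 0.0
    for z, y in groups:
        z_bar = sum(z) / len(z)
        y_bar = sum(y) / len(y)
        # Within-treatment corrected cross products and sum of squares.
        pooled_zy += sum((a - z_bar) * (b - y_bar) for a, b in zip(z, y))
        pooled_zz += sum((a - z_bar) ** 2 for a in z)
    return pooled_zy / pooled_zz


# (alloy z, wear y) for the small, medium, and large kits in Table 3.
kit_groups = [
    ([15, 16, 19, 22, 24, 25, 32], [33, 31, 31, 30, 29, 27, 26]),
    ([28, 31, 34, 38, 40, 43, 46], [24, 22, 23, 19, 20, 17, 18]),
    ([40, 43, 48, 50, 53, 55, 58], [16, 14, 13, 11, 11, 9, 9]),
]

beta_hat = pooled_slope(kit_groups)
print(f"common slope = {beta_hat:.6f}")
```

Note that the pooled slope is not the average of the three separate slopes; each treatment contributes in proportion to its own corrected sum of squares of the covariate.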


We now need to find the sum of squares accounted for by the reduced
model. It will become a component in the test statistic for a common
slope.

    $R(\tau_1, \ldots, \tau_t, \beta) = \hat{\eta}' \Gamma' y$

        $= \sum_i n_i \bar{y}_{i.}^2 + \frac{(E_{zy}^{(\cdot)})^2}{E_{zz}^{(\cdot)}}$.

The Residual (reduced) becomes:

    Residual (reduced) $= y'y - \hat{\eta}' \Gamma' y$

        $= \sum_{ij} y_{ij}^2 - \sum_i n_i \bar{y}_{i.}^2 - \frac{(E_{zy}^{(\cdot)})^2}{E_{zz}^{(\cdot)}}$

        $= E_{yy}^{(\cdot)} - \frac{(E_{zy}^{(\cdot)})^2}{E_{zz}^{(\cdot)}}$.

The Residual (reduced) has associated with it $n_. - (t + 1)$ degrees of
freedom (d.f.) since the rank of $\Gamma'\Gamma$ is (t + 1).

One can derive the likelihood ratio test, but an equivalent test
statistic is given by $U_1$,

    $U_1 = \frac{\text{Residual (reduced)} - \text{Residual (full)}}{\text{Residual (full)}} \cdot \frac{\text{d.f. Residual (full)}}{\text{d.f. Residual (reduced)} - \text{d.f. Residual (full)}}$

and if we assume $e \sim N(0, \sigma^2 I_n)$, then $U_1$ has an F-distribution under
the null hypothesis. Therefore, $U_1$ becomes

    $U_1 = \frac{\left[ E_{yy}^{(\cdot)} - (E_{zy}^{(\cdot)})^2 / E_{zz}^{(\cdot)} \right] - \sum_i \left[ E_{yy}^{(i)} - (E_{zy}^{(i)})^2 / E_{zz}^{(i)} \right]}{\sum_i \left[ E_{yy}^{(i)} - (E_{zy}^{(i)})^2 / E_{zz}^{(i)} \right]} \cdot \frac{n_. - 2t}{t - 1}$

and $U_1 \sim F(t - 1, n_. - 2t)$ when H0 is true.  $U_1 \sim F(t - 1, n_. - 2t, \lambda)$
when H0 is not true, where $\lambda$ =


Test  for  a Common  Mean 


2o*. 2 


l ni(yi  " zi.): 


After  one  has  tested  for  a common  slope,  one  may  then  wish  to  test 
for  a common  treatment  mean,  that  is 

H0:  $\tau_1 = \tau_2 = \cdots = \tau_t = \tau$

H1:  At least one treatment mean is different from the other treatment
means.

In  testing  for  a common  mean,  one  must  consider  the  test  in  terms  of 
what  has  already  transpired;  i.e.,  the  results  of  the  previous  "Test  for 
a Common  Slope,"  must  be  considered.  Therefore,  two  situations  should  be 
considered: 


(a)  Case 1, where H0 was rejected in the test for a common
slope.  The model to be considered under the hypothesis for a common
mean is:

    $y_{ij} = \tau + \beta_i z_{ij} + e_{ij}$

versus

    $y_{ij} = \tau_i + \beta_i z_{ij} + e_{ij}$.

The test for a common mean (intercept) will depend upon the covariate
location.  This situation will not be pursued.

(b)  Case 2, where H0 was not rejected in the test for a common
slope.  The model to be considered under the hypothesis for a common mean
is:

    $y_{ij} = \tau + \beta z_{ij} + e_{ij}$                (7)

versus

    $y_{ij} = \tau_i + \beta z_{ij} + e_{ij}$.                (8)


It may be noted that Equation (8) is the same as the reduced model under
the hypothesis of a common slope. It now becomes the full model for
testing a common intercept and therefore the Residual (full) is

    $E_{yy}^{(\cdot)} - \frac{(E_{zy}^{(\cdot)})^2}{E_{zz}^{(\cdot)}}$.

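We now need to develop the Residual (reduced) for Equation (7).
First, defining the components of the matrix model describing the
reduced model, $y_i$, $y$, $z_i$, $e_i$, and $e$ remain as before. The other
components are defined as:

    $\Gamma$ ($n_. \times 2$) $= [j_{n_.} \;\; z]$, where $z = [z_1', \ldots, z_t']'$, and $\eta$ ($2 \times 1$) $= [\tau, \beta]'$.

The matrix model is

    $y = \Gamma \eta + e$

where $E(e) = 0$ and $E(e e') = \sigma^2 I$.

The normal equations are

    $\Gamma' \Gamma \hat{\eta} = \Gamma' y$

where

    $\Gamma'\Gamma$ ($2 \times 2$) $= \begin{bmatrix} n_. & n_. \bar{z}_{..} \\ n_. \bar{z}_{..} & \sum_{ij} z_{ij}^2 \end{bmatrix}$ and $\Gamma' y$ ($2 \times 1$) $= \begin{bmatrix} n_. \bar{y}_{..} \\ \sum_{ij} z_{ij} y_{ij} \end{bmatrix}$.

The normal equations may be expressed as

    $n_. \hat{\tau} + n_. \bar{z}_{..} \hat{\beta} = n_. \bar{y}_{..}$

and

    $n_. \bar{z}_{..} \hat{\tau} + \sum_{ij} z_{ij}^2 \hat{\beta} = \sum_{ij} z_{ij} y_{ij}$.

So $\hat{\beta}$ and $\hat{\tau}$ are estimated by

    $\hat{\beta} = \frac{\sum_{ij} (z_{ij} - \bar{z}_{..})(y_{ij} - \bar{y}_{..})}{\sum_{ij} (z_{ij} - \bar{z}_{..})^2}$

and $\hat{\tau} = \bar{y}_{..} - \hat{\beta} \bar{z}_{..}$.

The sum of squares associated with $\hat{\tau}$ and $\hat{\beta}$ may now be found to be:

    $R(\tau, \beta) = \hat{\eta}' \Gamma' y$

        $= \hat{\tau} n_. \bar{y}_{..} + \hat{\beta} \sum_{ij} z_{ij} y_{ij}$

and the residual sum of squares for the reduced model becomes

    Residual (reduced) $= y'y - \hat{\eta}' \Gamma' y$

        $= \sum_{ij} y_{ij}^2 - \hat{\tau} n_. \bar{y}_{..} - \hat{\beta} \sum_{ij} z_{ij} y_{ij}$

        $= \sum_{ij} (y_{ij} - \bar{y}_{..})^2 - \hat{\beta} \left[ \sum_{ij} (z_{ij} - \bar{z}_{..})(y_{ij} - \bar{y}_{..}) \right]$.

Let the following notation stand for the respective sums of squares
and cross products:

    $T_{zy} = \sum_{ij} (\bar{z}_{i.} - \bar{z}_{..})(\bar{y}_{i.} - \bar{y}_{..})$

    $T_{zz} = \sum_{ij} (\bar{z}_{i.} - \bar{z}_{..})^2$

    $T_{yy} = \sum_{ij} (\bar{y}_{i.} - \bar{y}_{..})^2$.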

Now consider the following identity:

    $(y_{ij} - \bar{y}_{..}) = (\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.})$.

Squaring and summing over all observations, one obtains

    $\sum_{ij} (y_{ij} - \bar{y}_{..})^2 = \sum_{ij} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{ij} (y_{ij} - \bar{y}_{i.})^2$.

This result is shown in Appendix A. Using the notation previously
given, the identity may be expressed as

    $\sum_{ij} (y_{ij} - \bar{y}_{..})^2 = T_{yy} + E_{yy}^{(\cdot)}$.

Likewise, it can be shown that the following identities hold:

    $\sum_{ij} (z_{ij} - \bar{z}_{..})^2 = \sum_{ij} (\bar{z}_{i.} - \bar{z}_{..})^2 + \sum_{ij} (z_{ij} - \bar{z}_{i.})^2$

        $= T_{zz} + E_{zz}^{(\cdot)}$

and that

    $\sum_{ij} (z_{ij} - \bar{z}_{..})(y_{ij} - \bar{y}_{..}) = \sum_{ij} (\bar{z}_{i.} - \bar{z}_{..})(\bar{y}_{i.} - \bar{y}_{..}) + \sum_i \left[ \sum_j (z_{ij} - \bar{z}_{i.})(y_{ij} - \bar{y}_{i.}) \right]$

        $= T_{zy} + E_{zy}^{(\cdot)}$.

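Thus,

    Residual (reduced) $= (T_{yy} + E_{yy}^{(\cdot)}) - \frac{(T_{zy} + E_{zy}^{(\cdot)})^2}{(T_{zz} + E_{zz}^{(\cdot)})}$

with ($n_. - 2$) degrees of freedom since the rank of the $\Gamma'\Gamma$ matrix
is 2. The test statistic for a common mean, say $U_2$, is given by

    $U_2 = \frac{\left[ (T_{yy} + E_{yy}^{(\cdot)}) - \frac{(T_{zy} + E_{zy}^{(\cdot)})^2}{(T_{zz} + E_{zz}^{(\cdot)})} \right] - \left[ E_{yy}^{(\cdot)} - \frac{(E_{zy}^{(\cdot)})^2}{E_{zz}^{(\cdot)}} \right]}{\left[ E_{yy}^{(\cdot)} - \frac{(E_{zy}^{(\cdot)})^2}{E_{zz}^{(\cdot)}} \right]} \cdot \frac{n_. - t - 1}{t - 1}$.

The denominator degrees of freedom, $n_. - t - 1$, are those of the Residual
(full), which is the Residual (reduced) of the common slope test. When
normality is assumed, $U_2 \sim F(t - 1, n_. - t - 1)$ when H0 is true.
$U_2 \sim F(t - 1, n_. - t - 1, \lambda)$ when H0 is not true with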


A = 


Z ni  (Pi  ' 3 z.)2 


The results of an analysis of covariance for a completely randomized
design are shown in Table 2, where D1SS and $n_. - 2t$ are the terms used in
computing the numerator of $U_1$, and D2SS and $n_. - t - 1$ are the terms
used in computing the numerator of $U_2$. ($T_{wv} + E_{wv}$) represents the sum
of products for the treatment and error terms for the indicated
subscripts. Draper and Smith (4) point out that $s_{y \cdot z}^2$ is an estimate of the
variance about the regression line in each treatment when a common
slope is assumed. An estimate of the error mean square will be $s_{y \cdot z}^2$.

Adjustment  of  Treatment  Means 

The formula for adjusting treatment means was presented in Section I
as being $\hat{\xi}_{i.} = \bar{y}_{i.} - \hat{\beta} (\bar{z}_{i.} - \bar{z}_{..})$. It is assumed that a common slope
was obtained for all treatments. Steel and Torrie (8) state, "Adjusted
treatment means are estimates of what the treatment means would be if all
$\bar{z}_{i.}$'s were at $\bar{z}_{..}$." The idea is presented graphically in Figure 1.

Suppose the results of two treatments are plotted. Let one treatment
response be represented by +'s with response and concomitant means given
by ($\bar{y}_{1.}$, $\bar{z}_{1.}$), respectively, and the other treatment represented by o's
having response and concomitant means ($\bar{y}_{2.}$, $\bar{z}_{2.}$), respectively. Let $\bar{z}_{..}$
be the overall concomitant variable mean, $\hat{\xi}_{1.}$ be the adjusted mean for
Treatment 1, and $\hat{\xi}_{2.}$ be the adjusted mean for Treatment 2. Then $\hat{\xi}_{i.}$ is
the estimate of what the treatment mean would be if all $\bar{z}_{i.}$'s were at $\bar{z}_{..}$.


TABLE 2.  ANALYSIS OF COVARIANCE TABLE FOR A COMPLETELY RANDOMIZED DESIGN

(Difference for Testing Means Equal)    t - 1    D2SS


Figure 1.  Adjustment of Treatment Means by Covariance Analysis


When considering an estimate of the difference between two
adjusted treatment means, one would have

    $\hat{\xi}_{i.} - \hat{\xi}_{k.} = \bar{y}_{i.} - \bar{y}_{k.} - \hat{\beta} (\bar{z}_{i.} - \bar{z}_{k.})$.

Increase  of  Precision  for  Randomized  Experiments 

Dot notation will no longer be used for sums of products associated
with the common slope model. It can be shown, by applying the results
in Table 2, that the estimated variance of the responses without covariance
is given by

    EMS $= \frac{E_{yy}}{n_. - t} = s_y^2$

and becomes

    EMS $= \frac{1}{n_. - t - 1} \left[ E_{yy} - \frac{(E_{zy})^2}{E_{zz}} \right] = s_{y \cdot z}^2$

when covariance analysis is employed. The variance of a treatment mean
is changed from $\sigma_y^2 / n$ to
$\sigma_{y \cdot z}^2 \left[ \frac{1}{n} + (\bar{z}_{i.} - \bar{z}_{..})^2 / \sum_{ij} (z_{ij} - \bar{z}_{i.})^2 \right]$. The
variance of the adjusted treatment means is developed in Appendix B.

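The estimated variance of the difference of two estimated adjusted
means is

    $V(\hat{\xi}_{i.} - \hat{\xi}_{k.}) = V[\bar{y}_{i.} - \bar{y}_{k.} - \hat{\beta} (\bar{z}_{i.} - \bar{z}_{k.})]$

        $= V[\bar{y}_{i.} - \bar{y}_{k.}] + (\bar{z}_{i.} - \bar{z}_{k.})^2 V(\hat{\beta}) - 2 \, \mathrm{cov}[(\bar{y}_{i.} - \bar{y}_{k.}), (\bar{z}_{i.} - \bar{z}_{k.}) \hat{\beta}]$.

It has been shown in Appendix B that cov($\bar{y}_{i.}$, $\hat{\beta}$) = 0. Likewise,
cov($\bar{y}_{k.}$, $\hat{\beta}$) = 0. $V(\hat{\beta})$ is also developed in Appendix B.
The above expression then reduces to

    $\hat{V}(\hat{\xi}_{i.} - \hat{\xi}_{k.}) = s_{y \cdot z}^2 / n_i + s_{y \cdot z}^2 / n_k + (\bar{z}_{i.} - \bar{z}_{k.})^2 s_{y \cdot z}^2 / \sum_{ij} (z_{ij} - \bar{z}_{i.})^2$

        $= s_{y \cdot z}^2 \left[ \frac{1}{n_i} + \frac{1}{n_k} + (\bar{z}_{i.} - \bar{z}_{k.})^2 / E_{zz} \right]$

where $s_{y \cdot z}^2$ estimates $\sigma_{y \cdot z}^2$. A disadvantage of the above form is that
$V(\hat{\xi}_{i.} - \hat{\xi}_{k.})$ is different for every pair of treatments being compared.
One may then like to have an average value for the variance.

Since the average would be over t treatments taken two at a time,
the average value for the variance of the difference between two adjusted
means is

    $\frac{1}{t(t - 1)} \sum_{i \ne k} \hat{V}(\hat{\xi}_{i.} - \hat{\xi}_{k.}) = \frac{s_{y \cdot z}^2}{t(t - 1)} \sum_{i \ne k} \left[ \frac{1}{n_i} + \frac{1}{n_k} + \frac{(\bar{z}_{i.} - \bar{z}_{k.})^2}{E_{zz}} \right]$

        $= \frac{s_{y \cdot z}^2}{t(t - 1)} \left[ \sum_{i \ne k} \left( \frac{1}{n_i} + \frac{1}{n_k} \right) + \frac{1}{E_{zz}} \sum_{i \ne k} (\bar{z}_{i.} - \bar{z}_{k.})^2 \right]$.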

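One may now apply the identity expressing the sum of squares of
deviations about the mean in terms of the sum of squares of all differences;
that is,

    $\sum_{i=1}^{t} (\bar{z}_{i.} - \bar{z})^2 = \frac{1}{2t} \sum_{i \ne k} (\bar{z}_{i.} - \bar{z}_{k.})^2$.

This identity is proved in Appendix C. We now have

    $\frac{1}{t(t - 1)} \sum_{i \ne k} \hat{V}(\hat{\xi}_{i.} - \hat{\xi}_{k.}) = 2 s_{y \cdot z}^2 \left[ \frac{1}{t} \sum_{i=1}^{t} \frac{1}{n_i} + \frac{1}{(t - 1) E_{zz}} \sum_{i=1}^{t} (\bar{z}_{i.} - \bar{z})^2 \right]$                (9)

where $\bar{z}$ is the unweighted mean of the treatment means,

    $\bar{z} = \frac{1}{t} \sum_{i=1}^{t} \bar{z}_{i.}$.

The harmonic mean, $n_H$, is defined as

    $\frac{1}{n_H} = \frac{1}{t} \sum_{i=1}^{t} \frac{1}{n_i}$.

Using this, Equation (9) reduces to

    $\frac{1}{t(t - 1)} \sum_{i \ne k} \hat{V}(\hat{\xi}_{i.} - \hat{\xi}_{k.}) = 2 s_{y \cdot z}^2 \left[ \frac{1}{n_H} + \frac{1}{(t - 1) E_{zz}} \sum_{i=1}^{t} (\bar{z}_{i.} - \bar{z})^2 \right]$.

If the design is balanced, then $n_i = r$ for each treatment. This implies
that $n_H = r$. Recall that

    $T_{zz} = \sum_{ij} (\bar{z}_{i.} - \bar{z}_{..})^2 = r \sum_i (\bar{z}_{i.} - \bar{z}_{..})^2$.

So employing these conditions we have

    $\frac{1}{t(t - 1)} \sum_{i \ne k} \hat{V}(\hat{\xi}_{i.} - \hat{\xi}_{k.}) = \frac{2 s_{y \cdot z}^2}{r} \left[ 1 + \frac{T_{zz}}{(t - 1) E_{zz}} \right]$

as a final result for the case when $n_i = r$.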
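The reduction from the pairwise average to the balanced closed form can be checked numerically. The following sketch is an illustration only: the function names are hypothetical, and the inputs (treatment covariate means 153/7, 260/7, 347/7, r = 7, an error mean square of 0.7199, and a pooled covariate sum of squares of 709.428576) are taken from the worked example later in this section.

```python
from itertools import permutations


def average_pairwise_variance(z_means, sizes, s2, e_zz):
    """Average of s2 * [1/n_i + 1/n_k + (z_i - z_k)^2 / E_zz] over all i != k."""
    t = len(z_means)
    total = 0.0
    for i, k in permutations(range(t), 2):
        gap = z_means[i] - z_means[k]
        total += s2 * (1 / sizes[i] + 1 / sizes[k] + gap ** 2 / e_zz)
    return total / (t * (t - 1))


def balanced_closed_form(z_means, r, s2, e_zz):
    """(2 s2 / r) * [1 + T_zz / ((t - 1) E_zz)] for equal replication r."""
    t = len(z_means)
    z_bar = sum(z_means) / t
    t_zz = r * sum((m - z_bar) ** 2 for m in z_means)
    return (2 * s2 / r) * (1 + t_zz / ((t - 1) * e_zz))


z_means = [153 / 7, 260 / 7, 347 / 7]   # covariate means of the three kits
r = 7                                    # replicates per treatment
s2 = 0.7199                              # error mean square about regression
e_zz = 709.428576                        # pooled within sum of squares of z

by_pairs = average_pairwise_variance(z_means, [r] * 3, s2, e_zz)
by_formula = balanced_closed_form(z_means, r, s2, e_zz)
print(f"pairwise average = {by_pairs:.4f}, closed form = {by_formula:.4f}")
```

The two routes agree because the sum over ordered pairs of squared differences equals 2t times the sum of squared deviations about the unweighted mean, which is the identity used above.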

Efficiency 

Snedecor and Cochran (12) state that a method for determining
whether analysis of covariance is more beneficial than the analysis
without covariance is to calculate the efficiency. The efficiency is
defined to be

    Efficiency $= \frac{s_y^2}{s_{y \cdot z}^2 \left[ 1 + \frac{T_{zz}}{(t - 1) E_{zz}} \right]}$.


The  denominator  is  defined  by  Snedecor  and  Cochran  as  being  "the  effective 
error  mean  square  per  observation  when  computing  the  error  variance  for 
any  comparison  among  the  treatment  means."  The  larger  the  value  of  the 
ratio,  the  more  efficient  is  analysis  of  covariance. 

EXAMPLE  OF  A COMPLETELY  RANDOMIZED  DESIGN 

A tool manufacturer markets three kit sizes, each consisting of
seven bits. The amount of alloy added for hardness varies from bit to bit
because of bit design and kit size: small, medium, and large. The
manufacturer is interested in finding whether the life expectancy of the
kits as a whole is the same. Each bit was mounted and subjected to material
of like density for equivalent lengths of time. It was decided that
the quantity of alloy (z) added for hardness would influence the amount
of wear (y). The test results are presented in Table 3.

In  testing  for  a common  slope,  one  would  want  to  determine  if 
the  model 


    $y_{ij} = \tau_i + \beta z_{ij} + e_{ij}$

can predict the same results as

    $y_{ij} = \tau_i + \beta_i z_{ij} + e_{ij}$.

Therefore the hypothesis will be

H0:  $\beta_1 = \beta_2 = \beta_3 = \beta$

H1:  $\beta_i \ne \beta_j$ for at least one i and j.


Calculations for the residual sums of products will be shown for the
small kit only.

TABLE 3.  DATA TABLE FOR EXAMPLE 1

                        Small                    Medium                    Large
                   Alloy       Wear         Alloy       Wear         Alloy       Wear
                     z           y            z           y            z           y
                Milligrams  Millimeters  Milligrams  Millimeters  Milligrams  Millimeters

                    15          33           28          24           40          16
                    16          31           31          22           43          14
                    19          31           34          23           48          13
                    22          30           38          19           50          11
                    24          29           40          20           53          11
                    25          27           43          17           55           9
                    32          26           46          18           58           9

TOTALS             153         207          260         143          347          83

GRAND TOTAL        760 (z)     433 (y)

CROSS PRODUCT
TOTAL                    4443                     5217                     4016

SUM OF SQUARES    3551        6157         9910        2963        17451        1025

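    $E_{zz}^{(1)} = \sum_j z_{1j}^2 - (\sum_j z_{1j})^2 / n_1$

        = 3551 - (153)^2/7
        = 206.857143

    $E_{yy}^{(1)} = \sum_j y_{1j}^2 - (\sum_j y_{1j})^2 / n_1$

        = 6157 - (207)^2/7
        = 35.714286

    $E_{zy}^{(1)} = \sum_j (z_{1j} - \bar{z}_{1.})(y_{1j} - \bar{y}_{1.})$

        $= \sum_j z_{1j} y_{1j} - (\sum_j z_{1j})(\sum_j y_{1j}) / n_1$

        = 4443 - (207)(153)/7
        = -81.428571

Now solving for an estimate $\hat{\beta}_1$, the slope for the small kit,

    $\hat{\beta}_1 = E_{zy}^{(1)} / E_{zz}^{(1)}$

        = -81.428571/206.857143
        = -0.393646

and for the adjusted sum of squares

    ADJ S.S. $= E_{yy}^{(1)} - (E_{zy}^{(1)})^2 / E_{zz}^{(1)}$

        = 35.714286 - (-81.428571)^2/206.857143
        = 3.660222 .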


The residual sums of products are presented in Table 4.  The test
statistic shows a value of 0.0421.  Comparing this to a tabulated F
value, we have

F(2, 15, α = .10) = 2.70 .

One would not reject H0 at this level.  Therefore, accept the model that
estimates the three sets of data by a common regression slope but possibly
having different intercepts.
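The value 0.0421 can be reproduced from the Table 3 data by fitting the full model (a separate slope per kit) and the reduced model (one common slope) and comparing their residual sums of squares. The sketch below is a minimal illustration; the function names are hypothetical and the data are keyed in from Table 3.

```python
def within_sums(z, y):
    """Corrected sums for one treatment: (E_zz, E_yy, E_zy)."""
    z_bar = sum(z) / len(z)
    y_bar = sum(y) / len(y)
    e_zz = sum((a - z_bar) ** 2 for a in z)
    e_yy = sum((b - y_bar) ** 2 for b in y)
    e_zy = sum((a - z_bar) * (b - y_bar) for a, b in zip(z, y))
    return e_zz, e_yy, e_zy


def common_slope_statistic(groups):
    """Return (U1, numerator d.f., denominator d.f.) for H0: equal slopes."""
    t = len(groups)
    n_total = sum(len(z) for z, _ in groups)
    sums = [within_sums(z, y) for z, y in groups]
    # Full model: each treatment keeps its own regression line.
    resid_full = sum(e_yy - e_zy ** 2 / e_zz for e_zz, e_yy, e_zy in sums)
    # Reduced model: the pooled sums give one common slope.
    p_zz = sum(s[0] for s in sums)
    p_yy = sum(s[1] for s in sums)
    p_zy = sum(s[2] for s in sums)
    resid_reduced = p_yy - p_zy ** 2 / p_zz
    df_num = t - 1
    df_den = n_total - 2 * t
    u1 = ((resid_reduced - resid_full) / df_num) / (resid_full / df_den)
    return u1, df_num, df_den


# (alloy z, wear y) for the small, medium, and large kits in Table 3.
kits = [
    ([15, 16, 19, 22, 24, 25, 32], [33, 31, 31, 30, 29, 27, 26]),
    ([28, 31, 34, 38, 40, 43, 46], [24, 22, 23, 19, 20, 17, 18]),
    ([40, 43, 48, 50, 53, 55, 58], [16, 14, 13, 11, 11, 9, 9]),
]

u1, df1, df2 = common_slope_statistic(kits)
print(f"U1 = {u1:.4f} on ({df1}, {df2}) d.f.")
```

The statistic is far below the tabulated 2.70, which is why the common slope model is retained.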

The adjusted and unadjusted means are presented in Table 5.  Comparing
the unadjusted means for the kits, one may conclude that the difference
in the average amount of wear between the large and small kits is
significant, but looking at the adjusted means for the same kit sizes, the
difference in the amount of wear has been greatly reduced.  The adjusted
means are estimates of what the average kit wear would be if compared on
the basis of each kit having the same amount of alloy.
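The adjustment can be traced step by step from the Table 3 data: estimate the common slope, then move each kit mean along that slope to the overall alloy mean. The sketch below is illustrative only; the function names are hypothetical.

```python
def common_slope(groups):
    """Pooled within-treatment cross products over pooled sums of squares."""
    e_zy = 0.0
    e_zz = 0.0
    for z, y in groups:
        z_bar = sum(z) / len(z)
        y_bar = sum(y) / len(y)
        e_zy += sum((a - z_bar) * (b - y_bar) for a, b in zip(z, y))
        e_zz += sum((a - z_bar) ** 2 for a in z)
    return e_zy / e_zz


def adjusted_means(groups_by_name):
    """Map each treatment name to (unadjusted mean, adjusted mean)."""
    groups = list(groups_by_name.values())
    slope = common_slope(groups)
    n_total = sum(len(z) for z, _ in groups)
    z_grand = sum(sum(z) for z, _ in groups) / n_total
    result = {}
    for name, (z, y) in groups_by_name.items():
        z_bar = sum(z) / len(z)
        y_bar = sum(y) / len(y)
        # Move the mean along the common slope to the overall covariate mean.
        result[name] = (y_bar, y_bar - slope * (z_bar - z_grand))
    return result


# (alloy z, wear y) for each kit size, from Table 3.
kits = {
    "small": ([15, 16, 19, 22, 24, 25, 32], [33, 31, 31, 30, 29, 27, 26]),
    "medium": ([28, 31, 34, 38, 40, 43, 46], [24, 22, 23, 19, 20, 17, 18]),
    "large": ([40, 43, 48, 50, 53, 55, 58], [16, 14, 13, 11, 11, 9, 9]),
}

means = adjusted_means(kits)
for name, (raw, adj) in means.items():
    print(f"{name}: unadjusted = {raw:.1f} mm, adjusted = {adj:.1f} mm")
```

Because the small kit has less alloy than average and the slope is negative, its mean wear is adjusted downward; the large kit is adjusted upward for the opposite reason.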

The  interest  now  is  to  determine  if  the  average  life  expectancy  of 
the  three  kits  is  the  same.  The  hypothesis  is 

H0:  $\tau_1 = \tau_2 = \tau_3 = \tau$

H1:  $\tau_i \ne \tau_j$ for at least one i and j.

In  calculating  the  sum  of  products,  only  the  cross  products 
will  be  shown. 


TABLE 5.  MEAN - VARIANCE TABLE


Table of Unadjusted/Adjusted Means

                 Small       Medium      Large
Unadjusted       29.6 mm     20.4 mm     11.9 mm
Adjusted         24.0 mm     20.8 mm     17.0 mm

Unadjusted Mean = $\bar{y}_{i.}$
Adjusted Mean = $\hat{\xi}_{i.} = \bar{y}_{i.} - \hat{\beta} (\bar{z}_{i.} - \bar{z}_{..})$

ESTIMATED VARIANCES FOR THE ADJUSTED TREATMENT MEAN WITH COMMON SLOPE

                 Small          Medium         Large
                 0.3243 mm^2    0.1168 mm^2    0.2976 mm^2

$\hat{V}(\hat{\xi}_{i.}) = s_{y \cdot z}^2 \left[ \frac{1}{n_i} + (\bar{z}_{i.} - \bar{z}_{..})^2 / \sum_{ij} (z_{ij} - \bar{z}_{i.})^2 \right]$


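    $T_{zy} + E_{zy} = \sum_{ij} (z_{ij} - \bar{z}_{..})(y_{ij} - \bar{y}_{..})$

        $= \sum_{ij} z_{ij} y_{ij} - \frac{1}{n_.} (\sum_{ij} z_{ij})(\sum_{ij} y_{ij})$

        = 13676 - (1/21)(760)(433)
        = -1994.47619 .

    $T_{zy} = \sum_{ij} (\bar{z}_{i.} - \bar{z}_{..})(\bar{y}_{i.} - \bar{y}_{..})$

        $= \sum_i \frac{1}{n_i} (\sum_j z_{ij})(\sum_j y_{ij}) - \frac{1}{n_.} (\sum_{ij} z_{ij})(\sum_{ij} y_{ij})$

        = (1/7)(97652) - (1/21)(760)(433)
        = -1720.19048 .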


The  results  are  tabulated  in  the  analysis  of  covariance  table,  Table  6. 

The test statistic gives a value of 25.848, and the tabulated F value for 2
and 17 degrees of freedom (d.f.) for α = .10 is 2.64.  The null hypothesis
is rejected, and the manufacturer concludes that the expected life for at
least one kit is different from the rest.
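The value 25.848 follows from comparing the common slope model with separate intercepts against the model with a single intercept. The sketch below reproduces it from the Table 3 data; it is an illustration with hypothetical function names, not the report's own computing routine.

```python
def sums_about(z, y, z_center, y_center):
    """Sums of squares and products of deviations about the given centers."""
    s_zz = sum((a - z_center) ** 2 for a in z)
    s_yy = sum((b - y_center) ** 2 for b in y)
    s_zy = sum((a - z_center) * (b - y_center) for a, b in zip(z, y))
    return s_zz, s_yy, s_zy


def common_mean_statistic(groups):
    """Return (U2, numerator d.f., denominator d.f.) for H0: equal intercepts."""
    t = len(groups)
    all_z = [a for z, _ in groups for a in z]
    all_y = [b for _, y in groups for b in y]
    n_total = len(all_z)
    # Within-treatment (E) sums: deviations about each treatment's own means.
    e_zz = e_yy = e_zy = 0.0
    for z, y in groups:
        g_zz, g_yy, g_zy = sums_about(z, y, sum(z) / len(z), sum(y) / len(y))
        e_zz += g_zz
        e_yy += g_yy
        e_zy += g_zy
    # Total (T + E) sums: deviations about the grand means.
    s_zz, s_yy, s_zy = sums_about(all_z, all_y,
                                  sum(all_z) / n_total, sum(all_y) / n_total)
    resid_full = e_yy - e_zy ** 2 / e_zz          # common slope, t intercepts
    resid_reduced = s_yy - s_zy ** 2 / s_zz       # common slope, one intercept
    df_num = t - 1
    df_den = n_total - t - 1
    u2 = ((resid_reduced - resid_full) / df_num) / (resid_full / df_den)
    return u2, df_num, df_den


# (alloy z, wear y) for the small, medium, and large kits in Table 3.
kits = [
    ([15, 16, 19, 22, 24, 25, 32], [33, 31, 31, 30, 29, 27, 26]),
    ([28, 31, 34, 38, 40, 43, 46], [24, 22, 23, 19, 20, 17, 18]),
    ([40, 43, 48, 50, 53, 55, 58], [16, 14, 13, 11, 11, 9, 9]),
]

u2, df1, df2 = common_mean_statistic(kits)
print(f"U2 = {u2:.3f} on ({df1}, {df2}) d.f.")
```

The denominator of the ratio, Residual (full) divided by its 17 degrees of freedom, is the error mean square about regression used in the precision comparison.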

An increase in the precision of the responses can be seen by comparing
the estimated variance without covariance, 6.57, to the estimate of the
variance about regression, 0.7199 (see page 2).  The estimated variance
associated with each adjusted treatment mean is presented in Table 5.

If one wished to consider the difference between two adjusted
means, such as $\hat{\xi}_{1.}$ and $\hat{\xi}_{3.}$, then the difference is estimated by:

    ($\hat{\xi}_{1.} - \hat{\xi}_{3.}$) = 38.10 - 31.19 = 6.91,


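while an estimate of the average variance would be

    $\frac{1}{t(t - 1)} \sum_{i \ne k} \hat{V}(\hat{\xi}_{i.} - \hat{\xi}_{k.}) = \frac{2 s_{y \cdot z}^2}{r} \left[ 1 + \frac{T_{zz}}{(t - 1) E_{zz}} \right]$

        $= \frac{2 (.7199)}{7} \left[ 1 + \frac{2697.80953}{(2)(709.428576)} \right]$

        = 0.5968 .

One may now like to see how efficient covariance analysis was:

    $E = \frac{s_y^2}{s_{y \cdot z}^2 \left[ 1 + \frac{T_{zz}}{(t - 1) E_{zz}} \right]}$

        $= \frac{6.57}{0.7199 \left[ 1 + \frac{2697.80953}{(2)(709.428576)} \right]}$

        = 3.15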


Analysis of covariance with 10 replicates per treatment will give estimates
of treatment difference which are just as precise as 32 replicates per
treatment without covariance analysis.
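The efficiency figure and the replicate comparison can be reproduced directly from the quantities quoted in this example. The sketch below is illustrative; the function name `efficiency` is hypothetical, and rounding the equivalent replicate count up to a whole number is an assumption about how the figure of 32 was obtained.

```python
import math


def efficiency(s2_y, s2_yz, t_zz, e_zz, t):
    """Ratio of the error mean square without covariance to the effective
    error mean square with covariance, s2_yz * [1 + T_zz / ((t - 1) E_zz)]."""
    effective_ems = s2_yz * (1 + t_zz / ((t - 1) * e_zz))
    return s2_y / effective_ems


# Values quoted in the example: variances, sums of squares, and t = 3 kits.
eff = efficiency(s2_y=6.57, s2_yz=0.7199,
                 t_zz=2697.80953, e_zz=709.428576, t=3)

# Replicates needed without covariance to match 10 replicates with it.
equivalent_replicates = math.ceil(10 * eff)
print(f"efficiency = {eff:.2f}, equivalent replicates = {equivalent_replicates}")
```

An efficiency above 1 indicates that the covariate removed more error variation than was lost through estimating the slope and through the spread of the treatment covariate means.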
SECTION III


MULTIVARIATE  COVARIANCE  ANALYSIS  MODEL 


INTRODUCTION 

In  this  section,  the  analysis  of  covariance  will  be  extended  from 
the  univariate  case  to  the  multivariate  case.  The  multivariate  case  to 
be  considered  is  one  where  there  is  a single  response  and  more  than  one 
concomitant  variable.  Complete  development  of  the  theory  will  not  be 
presented;  only  enough  will  be  presented  to  tie  in  with  what  was  presented 
in  Section  II.  An  example  displaying  the  multiple  covariance  technique 
in  a randomized  block  design  with  unequal  sample  sizes  will  be  presented, 
and  a test  statistic  for  the  hypothesis  of  no  interaction  will  be  derived. 

The  case  of  many  responses  and  many  concomitant  variables  will  not 
be  considered.  Morrison  (8)  gives  a brief  account  of  this  case.  Hazel 
(6)  presents  an  analysis  of  covariance  for  multivariate  data  with  unequal 
subclass  sizes.  The  data  is  presented  in  a regression  type  of  analysis 
of  variance  table  with  no  indication  of  adjustments  for  the  concomitant 
variables. The Statistical Analysis System (SAS) general linear model
routine will present the data in the same format. The regression routine
is used instead of the analysis of variance routine because of the
computational procedures required to deal with unequal sample sizes.

THEORY 

It was shown in Section II that the analysis of covariance model
can be written in matrix form as

$$y = \Gamma \eta + \varepsilon$$

where $y$ is an $n \times 1$ vector of responses, $\Gamma$ is an $n \times p$ matrix containing the
classification pattern and values of the covariates, and $\eta$ is a $p \times 1$ vector
of unknown constants. When dealing with multivariate data, it may be
helpful to partition $\Gamma$ and $\eta$, thereby separating design and covariate
information:

$$y = [X \; \vdots \; Z] \begin{bmatrix} \tau \\ \beta \end{bmatrix} + \varepsilon$$

$$y = X\tau + Z\beta + \varepsilon .$$




Searle's  (11)  approach  is  more  compatible  with  the  example  to  be  presented 
and,  therefore,  will  be  followed. 




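Normal Equations. In solving the normal equations for the best
estimates of $\tau$ and $\beta$, $a$ and $b$ will serve as trial estimators for $\tau$ and $\beta$,
respectively.

$$\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z \end{bmatrix}
  \begin{bmatrix} a \\ b \end{bmatrix}
  = \begin{bmatrix} X'y \\ Z'y \end{bmatrix}$$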

In the situation with more than one observation under each set of
conditions, i.e., the design conditions, $X'X$ will not be of full rank, but more
than likely $Z'Z$ will be of full rank. Let $(X'X)^-$ be a generalized inverse
of $X'X$. Using the first equation of the normal equations, a solution for
$a$ in terms of $b$ may be obtained.

$$a = (X'X)^- [X'y - X'Z b] = (X'X)^- X'y - (X'X)^- X'Z b .$$

Now using the second equation of the normal equations in solving for $b$
after substituting for $a$:

$$Z'X [(X'X)^- X'y - (X'X)^- X'Z b] + Z'Z b = Z'y$$

$$b = \{ Z' [I - X (X'X)^- X'] Z \}^{-1} Z' [I - X (X'X)^- X'] y .$$

Let $H = I - X (X'X)^- X'$, then

$$b = (Z'HZ)^{-1} Z'Hy$$

Even though $(X'X)^-$ is not unique, it appears in $b$ only in the form
$X (X'X)^- X'$, which is unique for any generalized inverse of $X'X$. Searle
(11) states that $H$ is both symmetric and idempotent. This ensures that $Z'HZ$
and $HZ$ have the same rank and, based on the properties of $X$ and $Z$
given in the partitioned equation, guarantees that $HZ$ has full column
rank and hence that $Z'HZ$ is non-singular. Therefore, $b$ is a unique solution
and is the b.l.u.e. of $\beta$. Following standard notation, let $\hat{\beta}$ represent
$b$; $\hat{\beta} = (Z'HZ)^{-1} Z'Hy$.
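The role of $H$ can be illustrated with a small sketch. It assumes the simplest design, a one-way classification in which $X$ holds only class indicators, so that applying $H$ amounts to subtracting each class mean. The data and function names are hypothetical and chosen so the answer is easy to check by hand.

```python
def within_class_residuals(values, classes):
    """Apply H = I - X (X'X)^- X' when X holds class indicators:
    each value has its own class mean subtracted."""
    sums, counts = {}, {}
    for v, c in zip(values, classes):
        sums[c] = sums.get(c, 0.0) + v
        counts[c] = counts.get(c, 0) + 1
    return [v - sums[c] / counts[c] for v, c in zip(values, classes)]


def covariate_slope(y, z, classes):
    """b = (Z'HZ)^{-1} Z'Hy for a single covariate z."""
    hz = within_class_residuals(z, classes)
    hy = within_class_residuals(y, classes)
    z_h_z = sum(r * r for r in hz)                  # Z'HZ (H is idempotent)
    z_h_y = sum(rz * ry for rz, ry in zip(hz, hy))  # Z'Hy
    return z_h_y / z_h_z


# Hypothetical toy data: two classes sharing a slope of 2, different intercepts.
classes = ["a", "a", "a", "b", "b", "b"]
z = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 10.0, 12.0, 14.0]
b = covariate_slope(y, z, classes)
print(b)
```

Because $H$ removes the class effects from both $z$ and $y$, the slope is estimated from within-class variation only, which is why the different intercepts do not disturb it.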
Solving for the Covariate Coefficients. Working with $\hat{\beta}$ in the
following matrix form

$$\hat{\beta} = \{ Z' [I - X (X'X)^- X'] Z \}^{-1} Z' [I - X (X'X)^- X'] y$$

may lead to difficult calculations, so a better computational method will
now be sought. Considering the following part of the above relation,

$$I - X (X'X)^- X' ,$$

it can be seen that the identity matrix can be identified with a total
amount of variation and the term

$$X (X'X)^- X'$$

can be identified with some other amount of variation. The two parts
together form a difference or residual which is idempotent. Looking at
the pre- and post-multipliers of the residual, one is able to see that
$\hat{\beta}$ consists of the inverse sum of squares of the covariate values and
the sum of products of the covariate and response. Let $R = HZ$ be the matrix
of residuals for the covariate terms, then

$$\hat{\beta} = (R'R)^{-1} R'y .$$

If one, so to speak, takes a step backward and expresses the relationship
as

$$R'R \hat{\beta} = R'y ,$$

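then the $\hat{\beta}$ values may be easily found. The above matrix may now be
expressed in equations as

$$\begin{aligned}
E_{z_1 z_1} \hat{\beta}_1 + E_{z_1 z_2} \hat{\beta}_2 + \cdots + E_{z_1 z_K} \hat{\beta}_K &= E_{z_1 y} \\
E_{z_2 z_1} \hat{\beta}_1 + E_{z_2 z_2} \hat{\beta}_2 + \cdots + E_{z_2 z_K} \hat{\beta}_K &= E_{z_2 y} \\
&\;\;\vdots \\
E_{z_K z_1} \hat{\beta}_1 + E_{z_K z_2} \hat{\beta}_2 + \cdots + E_{z_K z_K} \hat{\beta}_K &= E_{z_K y}
\end{aligned}$$

where the $E_{z_i z_j}$ are error or residual entries in a covariate analysis table. The
solution vector of the normal equations in a covariate model then becomes

$$\begin{bmatrix} a \\ \hat{\beta} \end{bmatrix}
  = \begin{bmatrix} (X'X)^- X'y - (X'X)^- X'Z \hat{\beta} \\ (Z'HZ)^{-1} Z'Hy \end{bmatrix} .$$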


Analysis  of  Covariance  Table.  The  format  for  the  analysis  of 
covariance  table  is  basically  the  same  as  presented  in  Section  II. 
Modifications  will  be  necessary  due  to  the  hypothesis  to  be  tested, 
having  unequal  sample  sizes,  and  additional  independent  variables. 

In  covariance  analysis,  interest  usually  centers  on  making 
inferences  about  aspects  of  the  classification  part  of  the  model. 

For the case under consideration, interest is on whether interaction
is important in the model. Considering a randomized block design
with two blocks, four treatments, and two covariates, a test will
be made to see if the full model

$$y_{ijk} = \mu + \rho_i + \tau_j + (\rho\tau)_{ij} + \beta_1 z_{1ijk} + \beta_2 z_{2ijk} + e_{ijk}$$

where $E(e) = 0$,

$V(e) = \sigma^2$,

can be predicted by the model

$$y_{ijk} = \mu + \rho_i + \tau_j + \beta_1 z_{1ijk} + \beta_2 z_{2ijk} + e_{ijk} .$$

The hypothesis to be tested is

$$H_0: (\rho\tau)_{ij} = 0 \text{ for all } ij$$

$$H_1: (\rho\tau)_{ij} \neq 0 \text{ for at least one } ij \text{ combination.}$$

The  analysis  of  covariance  table  is  summarized  in  Table  7.  Table  8 
presents  the  equations  necessary  for  obtaining  Table  7.  The  equations 
are expressed in terms of the dummy variables w and v. $B_{ww}$ represents

TABLE 7. ANALYSIS OF COVARIANCE TABLE FOR A RANDOMIZED BLOCK DESIGN WITH UNEQUAL SAMPLE SIZES


TABLE 8. EQUATIONS FOR TABLE 7

Let $w_{ijk}$ represent any variable, where $i = 1, \ldots, r$; $j = 1, \ldots, t$; $k = 1, \ldots, n_{ij}$.

$$S_{ww} = \sum_{ijk} (w_{ijk})^2 - CF$$

$$CF = n_{...} \bar{w}_{...}^2$$

$$H_{ww} = \sum_{ij} \frac{1}{n_{ij}} [w_{ij.}]^2 - CF$$

$$P_{ww} = \sum_j \frac{1}{n_{.j}} [w_{.j.}]^2 + \sum_i \phi_i q_i - CF$$

$$B_{ww} = \sum_i \phi_i q_i \quad \text{(obtained from Doolittle table)}$$

$$T_{ww} = \sum_j \phi_j q_j$$

$$D_{ww} = H_{ww} - P_{ww}$$

$$E_{ww} = S_{ww} - H_{ww}$$

Let $v_{ijk}$ be a variable different from $w_{ijk}$.

$$S_{wv} = \sum_{ijk} w_{ijk} v_{ijk} - CCF$$

$$CCF = n_{...} \bar{w}_{...} \bar{v}_{...}$$

$$H_{wv} = \sum_{ij} \frac{1}{n_{ij}} w_{ij.} v_{ij.} - CCF$$

$$P_{wv} = \sum_j \frac{1}{n_{.j}} w_{.j.} v_{.j.} + \sum_i \phi_i q_i - CCF$$

$$B_{wv} = \sum_i \phi_i q_i \quad \text{(obtained from Doolittle table)}$$

$$T_{wv} = \sum_j \phi_j q_j$$

$$D_{wv} = H_{wv} - P_{wv}$$

$$E_{wv} = S_{wv} - H_{wv}$$


the row sum of squares after adjusting for column effects, and $T_{ww}$
represents the column sum of squares after adjusting for row effects.
Either $B_{ww}$ or $T_{ww}$ is obtained by the Doolittle method (13). Table 9
presents the required format for employing the Doolittle method
for determining $B_{ww}$ and $B_{wv}$. Table 12 presents the results of applying
the Doolittle method to an example. $B_{ww}$ is calculated by

$$B_{ww} = \sum_i \phi_i q_i ,$$

where $\phi_i$ and $q_i$ are obtained from Table 12. $T_{ww}$ can be obtained from
the following relationship:

$$\sum_j \frac{w_{.j.}^2}{n_{.j}} + B_{ww} = \sum_i \frac{w_{i..}^2}{n_{i.}} + T_{ww} , \qquad (10)$$

and $T_{wv}$ can be obtained from the following relationship:

$$\sum_j \frac{w_{.j.} v_{.j.}}{n_{.j}} + B_{wv} = \sum_i \frac{w_{i..} v_{i..}}{n_{i.}} + T_{wv} , \qquad (11)$$

where $B_{wv} = \sum_i \phi_i q_i$.

EXAMPLE  OF  A RANDOMIZED  BLOCK  DESIGN  WITH  UNEQUAL  SAMPLE  SIZES 

A researcher,  working  for  a well-known  organization,  wanted  to 
determine  some  penetration  properties  of  projectiles  with  various  nose 
shapes  against  armor  plating.  He  decided  on  four  nose  shapes  and  two 
types  of  armor  plating.  After  securing  the  four  types  of  projectiles, 
it was noticed that the weight of the projectiles varied by shape. His
original idea was to eliminate the influence of projectile weight by
having all shapes contain the same mass. Further, he knew that equal
amounts of propellant will not necessarily give the same velocity to
like projectiles. Not wanting the influence of the two variables,
weight ($Z_1$) and velocity ($Z_2$), in his results, he reduced the data
using the analysis of covariance method.

The  data  is  presented  in  Table  10.  Totals  for  the  raw  data  are 
presented  in  Table  11.  The  experimental  unit  is  the  projectile  mass 
and  is  subjected  to  four  shapes  (treatments) . The  response  variable 
is  the  weight  of  the  projectile  after  penetrating  the  armor  plating. 

By using the equations given in Table 8 and the values given in
Table 11, one is then able to construct the analysis of covariance
table (Table 13). The Doolittle values, $B_{wv}$, are obtained from Table 12,
TABLE 10. RAW DATA TABLE FOR EXAMPLE 2

OBS  Metal  Shape     Z1      Z2       Y
  1    A      C     113.8     677    113.7
  2    A      C     113.2     589    113.0
  3    A      C     114.0     556    114.0
  4    A      C     114.1     880    113.4
  5    A      C     112.9     331    112.9
  6    A      C     113.7     319    113.7
  7    A      C     113.2     236    113.2
  8    A      C     112.8     458    112.3
  9    A      C     112.8     405    112.7
 10    A      C     113.5     589    113.5
 11    A      C     113.9     570    113.9
 12    A      C     114.1     557    114.1
 13    A      C     114.1     529    114.0
 14    A      C     113.6     512    113.2
 15    A      S     117.5     965    116.8
 16    A      S     116.8     993    116.1
 17    A      S     118.5     959    118.0
 18    A      S     117.4     853    117.4
 19    A      S     116.7     786    116.5
 20    A      S     117.7     704    117.4
 21    A      S     118.3     626    118.2
 22    A      S     118.0     604    117.9
 23    A      S     117.7     564    117.7
 24    A      S     117.2     431    117.1
 25    A      S     118.0     371    117.9
 26    A      S     118.4     316    118.3
 27    A      T     111.2     372    110.5
 28    A      T     111.0     365    110.9
 29    A      T     110.7     278    110.7
 30    A      T     109.7     414    109.7
 31    A      T     109.2     499    109.1
 32    A      T     112.7     565    112.7
 33    A      T     114.9     924    114.8
 34    A      T     112.9     857    112.6
 35    A      O     111.1     514    111.1
 36    A      O     111.1     410    111.1
 37    A      O     111.5     368    111.4
 38    A      O     111.3     356    111.3
 39    A      O     110.9     306    110.9
 40    A      O     110.8     845    110.9
 41    A      O     110.4     903    110.1
 42    A      O     110.9     905    109.6
 43    A      O     110.5     872    108.6
 44    A      O     111.9     731    111.9
 45    A      O     110.5     700    110.3
TABLE 10. RAW DATA TABLE FOR EXAMPLE 2 (CONTINUED)

OBS  Metal  Shape     Z1      Z2       Y
 46    A      O     110.3     703    110.3
 47    A      O     109.3     593    109.3
 48    A      O     107.8     582    107.7
 49    S      C     113.7     780     82.9
 50    S      C     114.0     822     83.7
 51    S      C     114.0     845     85.5
 52    S      C     113.5     881     87.0
 53    S      C     113.8     777     72.5
 54    S      C     113.3     870     97.2
 55    S      C     112.2     895     96.1
 56    S      C     112.3     918     97.9
 57    S      C     112.3     938     96.0
 58    S      C     112.2     962     98.1
 59    S      C     112.3    1016     96.9
 60    S      C     112.4    1030     99.0
 61    S      C     112.6    1091     94.0
 62    S      C     112.1    1104     93.5
 63    S      S     117.8     871     85.6
 64    S      S     116.7     925    111.9
 65    S      S     117.4     926     91.3
 66    S      S     116.9     957     94.5
 67    S      S     117.0     980     94.9
 68    S      S     117.0    1002     93.4
 69    S      S     117.5     870     82.6
 70    S      S     117.1     871     84.9
 71    S      S     117.7     833     83.1
 72    S      S     118.3     802     78.2
 73    S      S     117.7     783     73.0
 74    S      T     113.5     943     92.5
 75    S      T     113.3     888     78.3
 76    S      T     113.0     896     77.1
 77    S      T     114.2     859     68.7
 78    S      O     114.4     955     90.4
 79    S      O     113.8     862     86.9
 80    S      O     113.9     956     92.3
 81    S      O     114.1     871     84.6
 82    S      O     111.3     854     66.1
 83    S      O     111.2     880     70.9
 84    S      O     110.9     916     80.0
 85    S      O     111.0     944     84.9
 86    S      O     112.3     991     94.4
 87    S      O     114.8     825     64.3
 88    S      O     110.4     849     92.0
TABLE 10. RAW DATA TABLE FOR EXAMPLE 2 (CONCLUDED)

OBS  Metal  Shape     Z1      Z2       Y
 89    S      O     114.7     926    106.3
 90    S      O     113.7     934    105.7
 91    S      O     112.7    1047    107.7
 92    S      O     112.3    1127    108.1
 93    S      O     113.0    1143    104.1
 94    S      O     114.3    1112    110.0
 95    S      O     103.8     982     96.3


TABLE 11. TABLE OF DATA TOTALS

TABLE FOR n

                      SHAPE
METAL        C         S         T         O       Total
  A         14        12         8        14          48
  S         14        11         4        18          47
Total       28        23        12        32          95

TABLE FOR Z1

                      SHAPE
METAL        C         S         T         O       Total
  A      1589.7    1412.2     892.3    1548.3      5442.5
  S      1580.7    1291.1     454.0    2022.6      5348.4
Total    3170.4    2703.3    1346.3    3570.9     10790.9

TABLE FOR Z2

                      SHAPE
METAL        C         S         T         O       Total
  A        7208      8172      4274      8788       28442
  S       12929      9820      3586     17174       43509
Total     20137     17992      7860     25962       71951

TABLE FOR y

                      SHAPE
METAL        C         S         T         O       Total
  A      1587.6    1409.3     891.0    1544.5      5432.4
  S      1280.3     973.4     316.6    1645.0      4215.3
Total    2867.9    2382.7    1207.6    3189.5      9647.7

$\sum_{ijk} Z_1 Z_2 = 8,178,973.4$

$\sum_{ijk} Z_1 y = 1,095,882.6$

$\sum_{ijk} Z_2 y = 7,160,083.4$
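Each cell of Table 11 is a column sum over the matching rows of Table 10. As a hedged check, the sketch below totals the eight (Z1, Z2, Y) triples that follow the C and S groups for metal A in Table 10, taken here to be the shape T cell. The variable names are illustrative.

```python
# (Z1, Z2, Y) triples for metal A, shape T, copied from Table 10
# (the pairing of these rows with shape T is an assumption).
metal_a_shape_t = [
    (111.2, 372, 110.5),
    (111.0, 365, 110.9),
    (110.7, 278, 110.7),
    (109.7, 414, 109.7),
    (109.2, 499, 109.1),
    (112.7, 565, 112.7),
    (114.9, 924, 114.8),
    (112.9, 857, 112.6),
]

n_cell = len(metal_a_shape_t)
# zip(*rows) turns the list of rows into three columns, one per variable.
z1_total, z2_total, y_total = (sum(col) for col in zip(*metal_a_shape_t))
print(n_cell, round(z1_total, 1), z2_total, round(y_total, 1))
```

The same pattern, applied to each metal by shape group, rebuilds the remaining cells and margins of Table 11.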


TABLE 12. DOOLITTLE TABLE FOR EXAMPLE 2


TABLE  13.  ANALYSIS  OF  COVARIANCE  TABLE  FOR  EXAMPLE  2 




and the $T_{ww}$ and $T_{wv}$ values are obtained by employing equations (10) and (11).
It is much easier to show how $B_{wv}$ is calculated than to try to explain.
Refer to Table 12 and the $Z_1$ and $Z_2$ columns.

$$B_{z_1 z_2} = \sum_i \phi_i q_i
  = (.2774706236)(320.0904241) + (-.5432734036)(-7612.005434)
  = 4,224.21579$$
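The sum of products above is easy to reproduce. In this sketch the first factor of each pair is treated as $\phi_i$ and the second as $q_i$; that assignment is an assumption, since Table 12 itself is not reproduced here.

```python
# Pairs read from the worked equation; which member is phi and which is q
# is assumed, and does not affect the product.
phi = [0.2774706236, -0.5432734036]
q = [320.0904241, -7612.005434]

b_z1z2 = sum(p * qi for p, qi in zip(phi, q))
print(round(b_z1z2, 5))
```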

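After obtaining the sum of the products, one may now solve for the
concomitant coefficients by:

$$E_{z_1 z_1} \hat{\beta}_1 + E_{z_1 z_2} \hat{\beta}_2 = E_{z_1 y}$$

$$168.05081 \hat{\beta}_1 + 1022.73128 \hat{\beta}_2 = -13.97642$$

and

$$E_{z_1 z_2} \hat{\beta}_1 + E_{z_2 z_2} \hat{\beta}_2 = E_{z_2 y}$$

$$1022.73128 \hat{\beta}_1 + 2,300,006.53102 \hat{\beta}_2 = 32,758.98067$$

thus obtaining

$$\hat{\beta}_1 = -.17031 \quad \text{and} \quad \hat{\beta}_2 = .01432$$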
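The two equations can be solved directly. The sketch below uses Cramer's rule, which is adequate for two covariates; it is one convenient method, not necessarily the one the author used, and the function name is hypothetical.

```python
def solve_two_covariates(e11, e12, e22, e1y, e2y):
    """Solve  e11*b1 + e12*b2 = e1y
              e12*b1 + e22*b2 = e2y   by Cramer's rule."""
    det = e11 * e22 - e12 * e12
    b1 = (e1y * e22 - e12 * e2y) / det
    b2 = (e11 * e2y - e12 * e1y) / det
    return b1, b2


beta1, beta2 = solve_two_covariates(
    e11=168.05081,       # E_z1z1
    e12=1022.73128,      # E_z1z2
    e22=2300006.53102,   # E_z2z2
    e1y=-13.97642,       # E_z1y
    e2y=32758.98067,     # E_z2y
)
print(round(beta1, 5), round(beta2, 5))
```

For more than two covariates the same system would be handled by elimination on the full matrix of error sums of squares and products.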

The  comparison  of  the  test  statistic  U to  a tabulated  F (2,  85)  at  the 
95  percent  level  indicates  that  the  Null  Hypothesis  is  not  rejected.  The 
interaction  term  need  not  be  considered  in  the  building  of  a predictive 
model.  Table  14  contains  the  means  for  any  comparisons  that  one  may 
want  to  make. 

If  one  wished  to  pursue  the  problem  further,  a test  for  treatments 
and  blocks  may  be  made  and  corresponding  adjusted  means  may  be  calculated. 

Table  14  contains  the  unadjusted  and  adjusted  means  for  the  response 

variable. 
SECTION IV


COVARIANCE ANALYSIS AS A TECHNIQUE FOR ANALYZING
INCOMPLETE DATA


INTRODUCTION 

In  Section  I,  it  was  stated  that  one  of  the  principal  uses  of 
covariance  analysis  was  to  analyze  data  when  some  responses  are  missing. 
The covariance analysis technique is an alternative method of predicting
missing values to that described by Snedecor and Cochran (12) under
missing data. Both techniques apply to data containing missing responses
that are to be analyzed by the analysis of variance method. The
covariance missing data technique presented here does not apply to predicting
missing values for analysis of covariance data, responses or covariates.

M.  S.  Bartlett  introduced  the  concept  of  using  covariance  analysis 
on  missing  data.  The  reason  why  an  alternative  method  was  sought  was 
because  no  general  algorithm  exists  for  dealing  with  missing  values. 
Special  formulae  exist  for  each  randomization  scheme,  and  adjusting  for 
the  bias  becomes  tedious. 

This section is based on an article by Coons (4) in which the
author presents a general method for the problem of missing data and also
demonstrates the ease with which exact tests of significance may be
obtained. The tests are exact when the errors are assumed to be
independent and normally distributed.

PROPERTIES FOR JUSTIFYING THE COMPUTATIONAL PROCEDURES

The following properties are quoted from Coons' article and are
attributed to various individuals. The article indicated that Property 1
is attributed to Fisher, Property 2 is implicitly assumed by several
authors, Property 3 to Bartlett, and Properties 4, 5, and 6 to Kempthorne.

1. If an analysis of variance is made with symbols
$\beta_1, \beta_2, \ldots, \beta_q$ in the place of missing observations,
then the best linear unbiased estimates of the missing
observations are the quantities $\hat{\beta}_1, \hat{\beta}_2, \ldots, \hat{\beta}_q$ which
minimize the error sum of squares.

2. Given that, with full data $(y_1, y_2, \ldots, y_n)$, the
best linear unbiased estimate of some linear function
of the parameters is $v_1 y_1 + v_2 y_2 + \cdots + v_n y_n$, then
the best estimate of that function with missing data
is obtained by replacing the missing y's with the
missing value estimates.

3. Let the data be observed data where obtained
and zero where missing. Introduce a concomitant
variable $X_m$ ($m = 1, \ldots, q$) corresponding to
the mth missing observation; let $X_m$ take the
value $-v$ for the mth missing observation and
zero for all others, missing or not. If the
error partial regression coefficients obtained
from an analysis of covariance are denoted by
$\hat{\beta}_1, \hat{\beta}_2, \ldots, \hat{\beta}_q$, then $v\hat{\beta}_1, v\hat{\beta}_2, \ldots, v\hat{\beta}_q$ are
the best linear unbiased estimates of the
missing observations.

4. Estimates of functions of data with missing
observations, and variances and covariances of
these estimates, may be obtained by the routine
application of formulae for adjusted means in
the analysis of covariance; i.e., by regarding
the zero yields supplied in the analysis of
covariance procedure as having variances of
$\sigma^2$. The above statement applies to functions of
the augmented data; the variance of a missing
observation per se is given by statement 5
following.

5. Denote the error sum of squares of $X_i$ by
$E_{ii}$ and the error sum of products of $X_i$ and
$X_j$ by $E_{ij}$. Then the variance of the ith missing
value estimate is $(v^2 u_{ii} - 1)\sigma^2$, and the covariance
of the ith and jth missing value estimates is
$v^2 u_{ij} \sigma^2$, when $(u_{ij})$ is the inverse of the matrix $(E_{ij})$.

6. The sum of squares for treatments obtained
by analyzing the data augmented by the missing
value estimate is always greater than or equal
to the exact sum of squares for treatment.
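Property 3 can be illustrated on a small case. The sketch below uses a hypothetical 3 x 3 randomized block layout with one missing response, puts zero in its place, codes the covariate as $-n$ at the missing cell, and compares $n\hat{\beta}$ with the classical randomized-block missing-plot formula. The data, the helper names, and the choice of a randomized block design are all assumptions made for illustration.

```python
def two_way_residuals(table):
    """Residuals after fitting additive row and column effects
    to a complete two-way table with one value per cell."""
    rows, cols = len(table), len(table[0])
    grand = sum(sum(r) for r in table) / (rows * cols)
    row_means = [sum(r) / cols for r in table]
    col_means = [sum(table[i][j] for i in range(rows)) / rows
                 for j in range(cols)]
    return [[table[i][j] - row_means[i] - col_means[j] + grand
             for j in range(cols)] for i in range(rows)]


def sum_products(a, b):
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical data: rows are blocks, columns are treatments, None is missing.
data = [[10.0, 12.0, 14.0],
        [11.0, 13.0, None],
        [9.0, 14.0, 16.0]]
blocks, treatments = len(data), len(data[0])
n = blocks * treatments

y = [[0.0 if v is None else v for v in row] for row in data]       # zero where missing
z = [[-float(n) if v is None else 0.0 for v in row] for row in data]  # -n where missing

rz, ry = two_way_residuals(z), two_way_residuals(y)
E_zz = sum_products(rz, rz)          # error sum of squares of the covariate
E_zy = sum_products(rz, ry)          # error sum of products
covariance_estimate = n * (E_zy / E_zz)

# Classical missing-plot formula for a randomized block design.
mi, mj = next((i, j) for i, row in enumerate(data)
              for j, v in enumerate(row) if v is None)
B = sum(y[mi])                       # block total containing the gap
T = sum(row[mj] for row in y)        # treatment total containing the gap
G = sum(sum(r) for r in y)           # grand total of observed values
classical_estimate = (treatments * T + blocks * B - G) / (
    (treatments - 1) * (blocks - 1))
print(E_zz, covariance_estimate, classical_estimate)
```

In this toy case the error sum of squares of the covariate equals n times the error degrees of freedom (9 x 4 = 36), which is the shortcut the text relies on when the covariate is coded as $-n$.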

COVARIANCE  TECHNIQUE  APPLIED  TO  ONE  MISSING  OBSERVATION 

The  covariance  technique  will  be  discussed  as  the  following  problem 
is  being  worked. 
Due to the world hunger problem, it has become important to try
to recover farm land in countries where herbicides were used during
recent military actions. Neutralizers were added to the soil samples
collected from various regions. Grain crops were then planted in the
treated soils to determine how much of the toxin in herbicides would
be passed on to humans and animals. It was decided to randomize the
experiment in a 4x4 Latin square and take two observations per
condition. The results of treating one herbicide are given in Table 15.

The experimental unit is a pot containing a plant. Applying the
covariance technique, the covariate, z, would take the value zero for
all responses, y, not missing and -n for the missing response. There
are 32 observations including the missing value, so n = 32. Other
authors have suggested that any convenient value may be assigned as the
covariate to the missing response because z and y are unrelated. Using
-n simplifies calculations, for any line entry in the analysis of
covariance table for the covariate sum of squares is simply n x (degrees of
freedom). The missing response takes the value zero, as stated in
Property 3, and the non-missing responses retain their values. Table 16
shows how the technique is applied.

With a single degree of freedom, the line entry for each of the
cross product sums of squares is

$$X_1 - X_2$$

where $X_1$ is the total of Y observations for the effect level which does
not contain the missing observation, and $X_2$ is the total of Y observations
for the effect level which contains the missing observation. For line
entries containing more than one degree of freedom,

$$\sum zy = \sum_i (X_{i1} - X_2) ,$$

the sum running over the levels that do not contain the missing observation.
(See Table 16.) The calculations of $\sum y^2$ are as usual and will not be
shown. $\beta$ is estimated by $\hat{\beta}$. An estimate of the missing value is given
by Property 3 to be $n\hat{\beta}$. It is not necessary to estimate the missing
value since a complete analysis of the data may be performed with the
value remaining unknown. The covariance technique enables one to make
exact tests readily with only minor supplementary computations.


An approximate test of significance may be obtained by computing
the biased sum of squares, which is equivalent to an analysis of $Y - \hat{\beta} Z$.
Property 6 states that the approximate sum of squares is greater than,
or equal to, the exact sum of squares. Therefore, any approximate mean
square which is not significant may be eliminated from consideration,
thereby shortening the calculations. The approximate sum of squares may be
computed as

$$\sum y^2 - 2\hat{\beta} \sum zy + \hat{\beta}^2 \sum z^2 .$$
TABLE 15. DATA TABLE FOR EXAMPLE 3

SOIL          - S1 - Sand
              - S2 - Sand + Herbicide
              - S3 - Clay
              - S4 - Clay + Herbicide

PLANT         - P1 - Wheat
              - P2 - Rice
              - P3 - Grass
              - P4 - Barley

NEUTRALIZERS  - A
              - B
              - C
              - D - Nothing

The response is the average amount of herbicide toxin found in
the grains of each plant, measured in counts per million.
TABLE 16. COMPUTATIONAL TABLE FOR EXAMPLE 3


z 

y 

z 

0 

D 

105 

0 

-32 

100 

0 

0 

B 

2 

0 

0 

7 

0 

0 

A 

89 

0 

0 

91 

0 

0 

C 

52 

0 

0 

77 

0 

P3 

y z 


52  0 
61  0 


2 0 
0 0 


3 0 
7 0 


92  0 

90  0 


P4 

y z 


12  0 
9 0 


93  0 

91  0 


$\sum C = 493$
$\sum D = 851$


Each $\sum z^2$ line entry = n x (degrees of freedom)
Total $\sum z^2$ = (32)(31) = 992


$\sum zy$ line entry = $\sum_i (Y_{i1} - Y_2)$


Soil $\sum zy$ = (553 - 430) + (516 - 430) + (540 - 430) = 319


$\hat{\beta} = E_{zy} / E_{zz} = 2030/704 = 2.88$


Missing Value Estimated = $n\hat{\beta}$ = 92
So for soil,

$$\text{Soil Approx SS} = 1148.0937 + (2.88)^2 (96) - 2 (2.88)(319) = 106.9161$$
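The soil line can be reproduced step by step. The sketch below uses only numbers quoted in Table 16 and in the soil calculation; the variable names are illustrative, and the rounded coefficient 2.88 is used because that is what the worked figure uses.

```python
n = 32                       # observations, including the missing one
soil_df = 3                  # four soils give three degrees of freedom

soil_zz = n * soil_df        # covariate sum of squares: n x (degrees of freedom)

missing_level_total = 430    # soil total that contains the missing value
other_level_totals = (553, 516, 540)
soil_zy = sum(total - missing_level_total for total in other_level_totals)

beta_hat = 2.88              # rounded E_zy / E_zz = 2030 / 704
soil_yy = 1148.0937          # soil sum of squares for y

approx_ss = soil_yy - 2 * beta_hat * soil_zy + beta_hat ** 2 * soil_zz
missing_value = n * (2030 / 704)
print(soil_zz, soil_zy, round(approx_ss, 4), round(missing_value))
```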

The approximate mean squares are obtained by dividing the
approximate sum of squares by the appropriate degrees of freedom. All of the
above calculations are summarized in the analysis of covariance table
(Table 17). The adjusted sum of squares is obtained in the usual way.
Exact tests of significance may now be made on the variations of interest.
Estimates of treatment means must be adjusted to the value zero of the
covariate variable instead of to the covariate average, i.e.,

$$\text{ADJ } \bar{Y} = \bar{Y} - \hat{\beta} \bar{Z}$$

where $\bar{Z}$ is the average of the covariate over the responses making up $\bar{Y}$. Since
Treatment A contained the missing value,

$$\text{ADJ } \bar{Y}_A = 79.63 - (2.88)(-4) = 91.15 .$$


The variance is given by

$$V(\text{ADJ } \bar{Y}_A) = \sigma^2 / n + (\bar{Z})^2 \sigma^2 / E_{zz}$$

where $n$ is here the number of responses making up $\bar{Y}_A$ (eight) and
$\sigma^2$ is estimated by $s^2_{y \cdot z}$. Therefore,

$$V(\text{ADJ } \bar{Y}_A) = 32.79 [1/8 + (-4)^2 / 704] = 4.84 .$$
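The adjusted mean and its variance for Treatment A follow from a few lines of arithmetic. This is a sketch using the quoted values; the names are illustrative, and the eight responses per treatment are inferred from the 1/8 in the variance formula.

```python
n_total = 32                 # covariate is coded -n at the missing response
responses_per_treatment = 8  # inferred from the 1/8 term
beta_hat = 2.88
E_zz = 704.0                 # error sum of squares of the covariate
s2 = 32.79                   # estimate of sigma^2

# Treatment A holds the missing value, so its covariate total is -32.
z_bar_a = -n_total / responses_per_treatment

adj_mean_a = 79.63 - beta_hat * z_bar_a
var_adj_mean_a = s2 * (1 / responses_per_treatment + z_bar_a ** 2 / E_zz)
print(z_bar_a, round(adj_mean_a, 2), round(var_adj_mean_a, 2))
```

Treatments that do not contain the missing value have a covariate mean of zero, so their means need no adjustment and their variance reduces to the first term.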


COVARIANCE  TECHNIQUE  APPLIED  TO  MORE  THAN  ONE  MISSING  OBSERVATION 

The  application  of  the  technique  will  be  discussed  as  the  following 
problem  is  being  worked. 

A research  laboratory  received  four  new  growth  chambers.  Before 
putting  them  into  use,  it  was  decided  to  conduct  a trial  experiment  to 
determine  the  variations  within  and  among  each  chamber.  Since  all 
chambers  were  large,  it  was  decided  to  divide  each  into  three  horizontal 
positions  and  two  vertical  positions  to  determine  if  location  had  any 
effect on plant growth. Six pots containing similar seed, soil, and
nutrients were randomly placed within each chamber. The experiment was
replicated twice for two months at a time. The response, plant height
in centimeters, was to be analyzed using a split unit analysis in strips.

Table 18 contains the raw data, augmented covariates, and the
totals necessary for computations. As before, the number of y
observations, including those missing, is equal to n. The value zero is
assigned to each missing y observation and to each covariate where the
y observation is not missing. For covariate values associated with
missing observations, the value of -n is assigned. When more than one
observation is missing, a multiple covariance analysis is needed. There
will be one covariate for each missing value.

The computations for $\sum z_1^2$ and $\sum z_2^2$ are the same as before. The one
column entry $\sum z_i^2$ will suffice for $\sum z_1^2$ and $\sum z_2^2$. Two situations may occur
in computing $\sum z_1 z_2$:

1. When $z_1$ and $z_2$ occur in the same level, the results are
the same as for $\sum z_i^2$:

$\sum z_1 z_2$ = n x (degrees of freedom).

2. When $z_i$ and $z_j$ occur in different levels,

(i) for no-interaction levels,

$\sum z_i z_j = -nr$,

where r depends upon the hierarchy classification.
With no hierarchy, r = 1.

(ii) for levels in which there is interaction, the main
effects and lower order interactions must be subtracted from


The author was unable to obtain Coons' results when following his
computational methods, so the usual method for obtaining sums of squares
was employed. Table 19 contains an example of the computations for the
cross products needed in building the analysis of covariance table
(Table 20). The line entries for the zy cross product sums of squares are
obtained as before,

$$\sum zy = \sum_i (X_{i1} - X_2) .$$
h2

H, 

Totals 


TABLE  18.  TABLE  OF  TOTALS  FOR  EXAMPLE  4 
REP  l 


Chamber 1


Chamber  2 


Chamber  3 


V 

1 

V 

2 

z ' 

Z 

2 

V 

1 

V 

2 

z 

1 

z 

2 

V 

1 

V 

2 

z 

l 

z 

2 

V 

1 

V 

2 

23 

21 

0 

0 

20 

18 

0 

0 

21 

19 

0 

0 

20 

25 

19 

17 

0 

0 

16 

14 

0 

0 

17 

15 

0 

0 

21 

19 

8 

6 

0 

0 

5 

3 

0 

0 

6 

4 

0 

0 

10 

8 

50 

44 

0 

0 

41 

35 

0 

0 

44 

38 

. 0 

0 

51 

52 

Chamber  4 


Totals 


138  0 

50  0 

355  0 


Chamber  1 


Chamber  2 


Chamber  3 


Chamber  4 


Z Z 
1 2 


21  0 0 21 


Totals  | 52  | 44  | 0 | 0 | 44  [ 40  J 0 | 0 | 43  J 19  |-48  | 0 | 44  ] 52  ] 0 j -48  [ 338  ] -48 

REP  X CHAMBER 

CHM  1 Z.  zl  ) CHM  2 Z,  Z,  I CHM  3 Z.  Z,  I CHM  4 Z.  Z.  I Tota 


Totals 


Z 


: 

1 2 

76 

0 

0 

82 

0 

0 

103 

0 

0 

84 

0 

0 

62 

-48 

0 

96 

0 

-48 

TABLE 18. TABLE OF TOTALS FOR EXAMPLE 4 (CONCLUDED)


MAIN  A UNIT  X CHAMBER  X REP 


CHM  1 

h 

z2 

CHM  2 

Z i 

z2 

CHM  3 

z> 

z2 

CHM  4 

Zi 

Z2 

2 

44 

0 

0 

38 

0 

0 ' 

40 

0 

0 

45 

0 

0 

36 

0 

0 

30 

0 

0 

32 

0 

0 

40 

0 

0 

14 

0 

0 

8 

0 

0 

10 

0 

0 

18 

0 

0 

45 

0 

0 

40 

0 

0 

38 

0 

0 

48 

0 

0 

34 

0 

0 

29 

0 

0 

16 

-48 

0 

35 

0 

0 

17 

0 

0 

15 

0 

0 

8 

0 

0 

13 

0 

-48 

0 144  -48  0 


MAIN  B UNIT  X CHAMBER  X REP 


CHM  1 

Zi 

Z2 

CHM  2 

Zi 

Z2 

CHM  3 

Zi 

Z2 

CHM  4 

Zi 

z2 

0 

0 

41 

0 

0 

44 

0 

0 

51 

0 

0 

44 

0 

0 

35 

0 

0 

38 

0 

0 

52 

0 

0 

52 

0 

0 

44 

0 

0 

43 

0 

0 

44 

0 

-48 

44 

0 

0 

40 

0 

0 

19 

-48 

0 

52 

0 

0 

TABLE  19.  COMPUTATIONAL  TABLE  FOR  EXAMPLE  4 


The $z_1 z_2$ cross product sum of squares is obtained by using the appropriate
cross product table. Using the main A unit X chamber X rep table, the main
A unit analysis is obtained:

$$\sum z_1 z_2 = \tfrac{1}{2} [(0)(0) + \cdots + (-48)(0) + \cdots + (0)(-48)] - \tfrac{1}{48} (-48)(-48) = -48 .$$

The $z_1 y$ cross product sum of squares for main B unit is

$$\sum z_1 y = \sum_i (X_{i1} - X_2)
  = (50 - 19) + (44 - 19) + \cdots + (52 - 19) - \text{Reps} - \text{Chambers}
  = 255 .$$

Estimates  of  the  missing  values  are  as  follows: 


Missing 

Main  A Unit 

Main  B Unit 

Subunit  AB 

13.44 

21.12 

7.2 

z2 

OO 

12 

18.24 

TABLE  20.  ANALYSIS  OF  COVARIANCE  TABLE  FOR  EXAMPLE  4 
Adjusted
Mean  Square 

2.0626 

1 

& 

rH 

00 

© 

1 

I 

-0.67245 
of Squares

35.0642 

10.6422 

-4.0347 

1 

4 

•i 

"l 

< 

3 

r 

21 

1 

13 

1 

1 

4 

« 

i 

< 

u 
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<4 

132.3721 

1734.0914 

37.3803 

1 

-2.9225 

N 
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o 

1 

LO 

CM 

■ 

■ 

S 

< CQ, 

oo 

CM 

1 

■ 

1 

LO 

rH 

N 

>s 

2331.8125 

2048.3125 

6.0208 

164.5625 

1767.125 

CM 

o 

vO 

o 

fH 

rH 

175.2297 

LO 

r** 

00 

rH 

CM 

133.0422 

108.2703 

23.625 

84.6453 

M “ 

M 

693 

381 

17 

-103 

a 

CT> 

00 

m 

r*» 



-45 

120 

237 

* 

201 

*8 

a 

>5 

N 

693 

309 

t". 

fH 

rH 

iH 

ro 

v£> 

238 

255 

i n 

h- 

210 

129 

■O’ 

PM 

00 

<35  “ 

N 
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N 
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■O’ 
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oo 

1 

48 

OO 

i 

-48 

o 

00 

1 

oo 
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■ 

00 

■*■ 

00 

1 

O 

"•H 

N 

2256 

1104 

OO 

144 

8 

864 

S28 

00 
624

96 

538 
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<M 

rH 

K> 

oo 

fH 
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rH 

B 
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CM 

B 
Total
A Analysis

Reps 

12 

Horizontal 

Error  A 

Main Unit
Vertical

Error  B 

A 

R 

1? 

35 

s 

Error  C 
For the $\sum y^2$ sum of squares, one follows the same procedures as in an
analysis of variance table. Note that, for the split unit in strips with
two main units, an adjustment is made in calculating the main unit sum of
squares. Looking at the main A unit X chamber X reps table and the main
B unit X chamber X reps table, the entries for chambers and reps are included
in both main unit calculations. Since they need to be accounted for only once,
they must be removed from the interaction unit. In this example, the chamber
and reps sums of squares were subtracted out in the $z_1 z_2$ column.

Values for $\hat{\beta}$ are obtained by solving the appropriate set of equations
as explained in Section III. Missing values are estimated by

$$\hat{y}_m = n\hat{\beta}_E = n \times (\hat{\beta} \text{ associated with the missing observation for the particular level}).$$

Obtaining the approximate sum of squares may again help reduce
computations. For multivariate data, the approximate sum of squares
is computed by

$$[Y - Z\hat{B}]'[Y - Z\hat{B}] = Y'Y - Y'Z\hat{B} - \hat{B}'Z'Y + \hat{B}'Z'Z\hat{B},$$

and for this example,

$$\sum y^2 - 2\hat{B}_1 \sum z_1 y - 2\hat{B}_2 \sum z_2 y + \hat{B}_1^2 \sum z_1^2 + \hat{B}_2^2 \sum z_2^2 .$$


The adjusted sum of squares for the error terms is obtained by

$$\sum y^2 - \hat{B}_1 \sum z_1 y - \hat{B}_2 \sum z_2 y ,$$

where the $\hat{B}_i$ and $\sum z_i y$ correspond to the appropriate level. When testing
line entries, follow the normal covariance analysis procedure and use the
appropriate error term.


SECTION  V 


COVARIANCE ANALYSIS FOR NON-PARAMETRIC DATA


INTRODUCTION 

Bross (2) put forth a non-parametric procedure for handling data
by means of covariance analysis. The procedure, the Covariable Adjusted
Sign Test (COVAST), is designed for detecting differences between two
treatments having binary responses with a single covariate. The
assumptions are:

(a)  The  covariable  and  response  have  a monotone  relationship. 

(b)  The  observations  are  independent. 

(c)  The  measurement  scale  of  the  covariate  is  at  least  ordinal . 

In practice, subjects are divided into two subsets such that the
individuals in each set possess covariate values which are representative
of the covariate range. Treatments are applied to subjects in each subset,
and it is expected that the proportion of subjects responding to a treatment
is 0.5. Ury (14) recognized that the expected proportion in each subset
may not be 0.5 and expanded the work of Bross to include these cases.

Quade (10) develops a procedure called "Rank Analysis of Covariance"
designed for handling treatment differences in responses measured on
at least an ordinal scale and having one or more covariates. The procedure
is comparable to a completely randomized analysis of covariance. He also
discusses other methods developed along this line. Puri and Sen (9)
develop a theoretical approach to the completely randomized case.

The  procedures  of  Bross  and  Ury  will  be  presented  in  this  section 
along  with  an  example  using  real  data.  The  other  procedures  will  not 
be  included  in  this  report. 

THE  COVAST  TEST 

Rationale 


Suppose one is faced with a situation where the result of an event is
a binomial response. Let this event be associated with a variable,
measured on an ordinal scale or better, which will have a changing influence
on the response of the event. Consider the following illustration:

    Covariate Scale      ------------ a ------------ b ------------>

    Binomial Response     Always "0"  |    Mixed     |  Always "1"

At point a and below on the covariate measurement scale, the response is
always the same. At point b and above, the response is always the same
but different from the response at a. For the interval (a,b), the responses
are mixed. For example, babies of a certain weight (covariate) may live or
die (event) when afflicted with a certain disease. Another example may be
combustion or non-combustion (event) at a given temperature (covariate).

One may then be interested in determining if there is a statistically
significant difference between two treatments under the situation
being considered. A treatment may be a drug cure for the disease or an
ignitor for stimulating combustion. To see how the covariate is taken
into account for comparison tests, one needs to assume the following:

1.  That  one  treatment  is  better  than  the  other. 

2.  That  the  chance  for  an  improvement  increases  either  as  the 
covariate  increases  or  as  it  decreases. 

The words "better" and "an improvement" may be understood in terms of
ordering the observed values of the covariate from values less than a
to values greater than b, where the response at the a end of the scale
represents an unfavorable response. Suppose two ignitors, H and M, are being
compared to determine whether M is significantly better than H for
starting fires. If the outcomes are the same for both treatments
regardless of temperature, no evidence is provided for a clear-cut superiority.
If one treatment started fires at high temperatures and the other did
not start a fire at low temperatures, the results might be attributed
to the initial conditions rather than to the treatments.

However, if one treatment starts fires at low temperatures and
the other treatment does not start fires at high temperatures, then this
would be evidence (but not conclusive) for an advantage to the treatment
which does start fires. One can compare the performance of the two
treatments by making pairwise comparisons. The comparisons would be
made on the basis of the following:

1.  One  of  the  ignitors  starts  a fire,  and 

2.  The  fire  was  started  at  a lower  temperature. 

In order to show a definite advantage for M, it must be shown that M's
ability to start fires is greater than that expected from sampling
variation alone. This is accomplished by counting the number of instances
where M starts fires at a lower temperature and H does not start fires at
higher temperatures. Let these situations be designated "non-inversions"
(NI). The opposite situation would be to count the number of instances
where H starts fires at lower temperatures and M does not start fires
at higher temperatures. Let these situations be designated "inversions"
(I). NI and I may now be compared, and if the value of NI is found to
be greater than its expectation, then one would have direct evidence
of an advantage for M.
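
The counting of non-inversions and inversions described above can be sketched in a few lines of Python. This is only an illustration: the function name `count_pairs`, the tuple layout, and the five shots are assumptions made for the example and are not part of the report.

```python
from typing import List, Tuple

# (covariate value, treatment label, fire started?) -- layout assumed for this sketch
Shot = Tuple[float, str, bool]


def count_pairs(shots: List[Shot], fire_by: str, no_fire_by: str) -> int:
    """Count pairs in which `fire_by` starts a fire at a lower covariate
    position and `no_fire_by` fails to start one at a higher position.

    `shots` must already be ordered by the covariate, with ties broken."""
    total = 0
    for low, (_, trt_low, fired_low) in enumerate(shots):
        if trt_low != fire_by or not fired_low:
            continue
        # Look only at shots ranked above the current one.
        for _, trt_high, fired_high in shots[low + 1:]:
            if trt_high == no_fire_by and not fired_high:
                total += 1
    return total


# Hypothetical data, ordered by temperature.
shots = [
    (20.0, "M", True),
    (22.0, "H", True),
    (25.0, "H", False),
    (27.0, "M", False),
    (30.0, "H", False),
]

ni = count_pairs(shots, "M", "H")   # non-inversions: M fires low, H fails high
inv = count_pairs(shots, "H", "M")  # inversions: H fires low, M fails high
print(ni, inv)
```

With these five hypothetical shots, the M fire at 20.0 is paired with the H no-fires at 25.0 and 30.0, and the H fire at 22.0 is paired with the M no-fire at 27.0.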

Hypothesis 

Let

    i = 0 if H is used,        j = 0 if no fire,
      = 1 if M is used;          = 1 if fire;

and let $N_{ij}$ be the number of observations in the ith series having the
jth response. Let $I_{kg}$ be the number of inversions where fires started
in the kth series are compared to no-fires in the gth series:

    k = 0 if H is used,        g = 0 if H is used,
      = 1 if M is used;          = 1 if M is used.

The  covariate  complicates  the  hypothesis  statement  because  of  the 
fact  that  it  determines  the  ordering  which  affects  the  inversions.  As 
a result  of  this  complication,  we  must  test  a compound  hypothesis. 

First, consider the hypothesis by parts and then as combined:

    $H_{0\cdot}$ : the two treatments are equally effective

    $H_{\cdot 0}$ : the covariable is irrelevant

    $H_{00}$ : neither the treatments nor the covariate are relevant
    to the event.

The respective alternative hypotheses may be stated as follows:

    $H_{1\cdot}$ : the two treatments are not equally effective
    $H_{\cdot 1}$ : the covariable is relevant
    $H_{11}$ : the treatment or the covariate is relevant.

The above compound hypotheses, $H_{00}$ and $H_{11}$, would be used for a
two-tailed test. The following compound hypotheses are used for a one-tailed
test:

    $H_{00}$ : Treatment 1 is equivalent to treatment 2, and the
    covariable is not important.

    $H'_{11}$ : Treatment 1 is less (greater) than treatment 2, or the
    covariable is important, or both.

Less (greater) may be interpreted as being better or an improvement. The
way the test statistic is taken will determine if the hypothesis is for
an upper or lower tailed test.

Test  Statistic 

Under $H_{0\cdot}$ and $H_{00}$, with the statement of no treatment difference,
one would expect the proportion of events occurring for the kth and gth
series to be the same. Let r be the proportion of events favoring the
kth series and (1 - r) be the proportion of events favoring the gth series.
Therefore, their expected proportions would be

$$E(r) = E(1 - r) = 0.5,$$

which implies that we expect $I_{10} = I_{01}$. The alternative hypothesis, $H_{11}$,
is supported when $I_{10} \ne I_{01}$, and the alternative hypothesis, $H'_{11}$, is supported
when $I_{10} > I_{01}$ or $I_{10} < I_{01}$, depending upon whether the one-tailed
test is upper or lower.

Bross states that Mann and Whitney (1947) proved that, given the
observed values $N_{00}$, $N_{01}$, $N_{10}$, and $N_{11}$, and under $H_{00}$, $I_{10}$ and $I_{01}$ have
the following expected values (E) and variances (V):

$$E(I_{10}) = N_{11} N_{00}/2$$

$$V(I_{10}) = N_{11} N_{00} (N_{11} + N_{00} + 1)/12$$

$$E(I_{01}) = N_{10} N_{01}/2$$

$$V(I_{01}) = N_{10} N_{01} (N_{10} + N_{01} + 1)/12$$

where $N_{ij}$ is the number of observations in the ith series having the
jth response.

$I_{10}$ and $I_{01}$ involve two distinct sets of data and are therefore
conditionally independent provided the original observations were independent.
So,

$$E(I_{10} - I_{01}) = (N_{11} N_{00} - N_{10} N_{01})/2$$

$$V(I_{10} - I_{01}) = [N_{11} N_{00} (N_{11} + N_{00} + 1) + N_{10} N_{01} (N_{10} + N_{01} + 1)]/12.$$
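
As a small numerical illustration of the conditional mean and variance of the difference, the following Python sketch evaluates both expressions for one set of cell counts. The function name `inversion_moments` and the counts passed to it are assumptions chosen for illustration.

```python
def inversion_moments(n00: int, n01: int, n10: int, n11: int):
    """Conditional mean and variance of I10 - I01, given the four cell counts.

    N_ij = number of observations in series i (0 = H, 1 = M)
    with response j (0 = no fire, 1 = fire)."""
    e_diff = (n11 * n00 - n10 * n01) / 2
    v_diff = (n11 * n00 * (n11 + n00 + 1)
              + n10 * n01 * (n10 + n01 + 1)) / 12
    return e_diff, v_diff


# Illustrative counts (assumed for this sketch).
e_diff, v_diff = inversion_moments(n00=3, n01=8, n10=10, n11=8)
print(e_diff, v_diff)
```

Because the two inversion counts are treated as conditionally independent, the variance of the difference is simply the sum of the two Mann-Whitney variances.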
A relationship between COVAST and the chi-square test for independence is
suggested because the expected value of $I_{10} - I_{01}$ is one-half the
numerator of the short cut form of the chi-square test.

Each $N_{ij}$ is, in reality, a random variable having a binomial
distribution with mean $N_{i\cdot}\pi$ and variance $N_{i\cdot}\pi(1 - \pi)$, where $\pi$ is the
probability of a fire, and the marginal totals $N_{1\cdot} = N_{11} + N_{10}$ and
$N_{0\cdot} = N_{00} + N_{01}$ are fixed. Thus, the expected value and variance of
$N_{ij}$ under $H_{00}$ are

$$E^*(N_{ij}) = \begin{cases} \pi N_{i\cdot} & \text{if a success occurs} \\ (1 - \pi) N_{i\cdot} & \text{if a failure occurs} \end{cases}$$

$$V^*(N_{ij}) = E^*(N_{ij} - \pi N_{i\cdot})^2 = N_{i\cdot}\pi(1 - \pi).$$

Substituting these into the above expectation:

$$\begin{aligned}
E^*[E(I_{10} - I_{01})] &= [E^*(N_{11})E^*(N_{00}) - E^*(N_{10})E^*(N_{01})]/2 \\
&= [\pi N_{1\cdot}(1 - \pi)N_{0\cdot} - (1 - \pi)N_{1\cdot}\pi N_{0\cdot}]/2 \\
&= 0
\end{aligned}$$

$$\begin{aligned}
12E^*[V(I_{10} - I_{01})] &= E^*(N_{00})E^*(N_{11})^2 + E^*(N_{11})E^*(N_{00})^2 + E^*(N_{11})E^*(N_{00}) \\
&\quad + E^*(N_{01})E^*(N_{10})^2 + E^*(N_{10})E^*(N_{01})^2 + E^*(N_{10})E^*(N_{01}) \\
&= (1 - \pi)N_{0\cdot}E^*(N_{11})^2 + \pi N_{1\cdot}E^*(N_{00})^2 \\
&\quad + \pi N_{0\cdot}E^*(N_{10})^2 + (1 - \pi)N_{1\cdot}E^*(N_{01})^2 \\
&\quad + 2\pi(1 - \pi)N_{1\cdot}N_{0\cdot}
\end{aligned}$$

Based on this value of $E^*[V(I_{10} - I_{01})]$, Bross argues that $V(I_{10} - I_{01})$
may be estimated by

$$\frac{(I_{10} + I_{01})(N_{\cdot\cdot} + 4)}{12}.$$

Hence, the statistic

$$\text{COVAST} = \frac{12 (I_{10} - I_{01})^2}{(I_{10} + I_{01})(N_{\cdot\cdot} + 4)} = \frac{12\,\text{UST}}{N_{\cdot\cdot} + 4}$$

has approximately a chi-square distribution with one degree of freedom.
COVAST is then a variation of the Uncorrected Sign Test (UST) and in
this form becomes a test statistic for a two-tailed test for non-parametric
covariance analysis.
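
A minimal Python sketch of the two-tailed statistic follows. The function names are assumptions made for this illustration; the inputs 11, 35, and 29 are the column totals and sample size reported for the example in this section. The tail area uses the identity that a chi-square variable with one degree of freedom is a squared standard normal, so its upper tail equals `erfc(sqrt(x / 2))`.

```python
import math


def covast(i10: int, i01: int, n_total: int) -> float:
    """COVAST = 12 * UST / (N.. + 4), with UST = (I10 - I01)^2 / (I10 + I01)."""
    if i10 + i01 == 0:
        raise ValueError("no informative comparisons: I10 + I01 is zero")
    ust = (i10 - i01) ** 2 / (i10 + i01)
    return 12 * ust / (n_total + 4)


def chi2_1df_upper_tail(x: float) -> float:
    """Upper-tail area of the chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))


stat = covast(i10=11, i01=35, n_total=29)
p_two_tailed = chi2_1df_upper_tail(stat)
print(round(stat, 4), round(p_two_tailed, 4))
```

The statistic would be compared with the tabulated chi-square value for one degree of freedom (3.841 at the 0.05 level).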

Ury (14) proposes a method of testing a one-sided hypothesis for
Bross's COVAST. Ury defines an r value as being "the proportion of
comparisons potentially favoring the treatment" considered to be an
improvement; i.e.,

$$r = \frac{T_4}{N_{1\cdot} N_{0\cdot}},$$

where $T_4$ is the total of the entries of column 4 in a table such as
Table 21. After ranking the treatments, a count is made to see how
many times the new treatment ranks below the standard treatment. This
count is made for each subject given the new treatment. The expectations
of $I_{10}$ and $I_{01}$ under $H_{00}$, when r is considered, become:

$$E(I_{10}) = r N_{11} N_{00}$$

$$E(I_{01}) = (1 - r) N_{10} N_{01}.$$

For a given r, $r_0$, the following conditional expectations hold:

$$E^* E(I_{10} \mid r_0) = \pi(1 - \pi)\, r_0\, N_{1\cdot} N_{0\cdot}$$

$$E^* E(I_{01} \mid r_0) = \pi(1 - \pi)(1 - r_0)\, N_{1\cdot} N_{0\cdot}$$

$$E^* E(I_{10} + I_{01} \mid r_0) = \pi(1 - \pi)\, N_{1\cdot} N_{0\cdot}$$

$$E^* E(I_{10} - I_{01} \mid r_0) = \pi(1 - \pi)(2r_0 - 1)\, N_{1\cdot} N_{0\cdot}$$


TABLE 21. TABULATION DATA FOR EXAMPLE 5

 (1)      (2)      (3)   (4)    (5)     (6)     (7)     (8)
                               F: M    F: H    F: M    F: H
 TEMP   IGNITOR  RESULT   r    N: H    N: M    N: M    N: H

 20.2      M        N     11
 21.4      H        F                     9               3
 22.0      M        N     10
 22.0      H        F                     8               3
 23.0      H        N
 25.0      H        F                     8               2
 26.0      M        N      7
 26.0      M        F      7      2               7
 26.0      M        F      7      2               7
 26.8      M        N      7
 27.2      H        F                     6               2
 28.0      M        F      6      2               6
 28.8      M        N      6
 29.0      M        N      6
 30.4      H        F                     4               2
 32.0      M        F      5      2               4
 32.6      M        N      5
 33.0      M        N      5
 33.5      M        N      5
 34.0      M        F      5      2               1
 34.0      H        N
 35.0      M        N      4
 35.0      H        F                     0               1
 37.0      M        F      3      1               0
 37.0      H        N
 37.0      H        F                     0               0
 37.0      H        F                     0               0
 40.0      M        F      0      0               0
 47.0      M        F      0      0               0

 TOTAL                    99     11      35      25      13

Column headings (5) through (8) give the ignitor that started a fire (F)
at the lower temperature and the ignitor that started no fire (N) at the
higher temperature.

$N_{ij}$ TABLE

                 NO FIRE    FIRE    TOTAL
 H (i = 0)          3         8       11
 M (i = 1)         10         8       18
 TOTAL             13        16       29

When $r_0 = 0.5$, $E^* E(I_{10} - I_{01}) = 0$, which agrees with Bross. Ury
suggests using the square root of COVAST for the one-sided test:

$$C = \left[\frac{12 (I_{10} - I_{01})^2}{(N_{\cdot\cdot} + 4)(I_{10} + I_{01})}\right]^{1/2} = (I_{10} - I_{01}) \left[\frac{12}{(N_{\cdot\cdot} + 4)(I_{10} + I_{01})}\right]^{1/2}.$$


Decision  Rule 


As with the two-tailed Sign Test, the COVAST test statistic for the
two-tailed alternative would be compared with the tabulated chi-square
with one degree of freedom. Since most chi-square tables are based on a
two-tailed distribution, COVAST may be compared directly at the
appropriate $\alpha$ level.

Two conditions must be considered for a one-tailed test. If the
alternate hypothesis is:

    $H'_{11}$: The new treatment mean is less than the standard treatment
    mean or the covariable is important or both,

then one would expect $T_5$, the total of column 5 from Table 21, to be less
than $T_6$, the total of column 6; i.e., expect $(I_{10} - I_{01}) < 0$. If it is,
and if $C < -z_\alpha$, then reject $H_{00}$, where $z_\alpha$ is from the standard normal
distribution. If $T_5$ is greater than $T_6$, then do not reject $H_{00}$.

If the alternate hypothesis is:

    $H'_{11}$: The new treatment mean is greater than the standard
    treatment mean or the covariable is important or both,

then one would expect $T_5 > T_6$; i.e., expect $(I_{10} - I_{01}) > 0$. If it is,
and if $C > z_\alpha$, then reject $H_{00}$. If $T_5 < T_6$, do not reject $H_{00}$.
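
The one-tailed decision rule can be sketched as follows. This is an illustration under stated assumptions: the function names, the `alternative` argument, and the totals 22, 6, and 30 are hypothetical, and C is taken in its signed form, $(I_{10} - I_{01})$ times the square root term. The normal quantile comes from Python's standard `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def ury_c(i10: int, i01: int, n_total: int) -> float:
    """Signed square root of COVAST: (I10 - I01) * sqrt(12 / ((N.. + 4)(I10 + I01)))."""
    return (i10 - i01) * math.sqrt(12 / ((n_total + 4) * (i10 + i01)))


def one_tailed_decision(i10: int, i01: int, n_total: int,
                        alpha: float, alternative: str):
    """Return (C, reject H00?) for alternative 'less' or 'greater'."""
    c = ury_c(i10, i01, n_total)
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # upper-alpha standard normal point
    if alternative == "less":
        # Direction check (T5 < T6) first, then compare C with -z_alpha.
        reject = i10 < i01 and c < -z_alpha
    elif alternative == "greater":
        # Direction check (T5 > T6) first, then compare C with z_alpha.
        reject = i10 > i01 and c > z_alpha
    else:
        raise ValueError("alternative must be 'less' or 'greater'")
    return c, reject


# Hypothetical totals: T5 = I10 = 22, T6 = I01 = 6, N.. = 30.
c_up, reject_up = one_tailed_decision(22, 6, 30, alpha=0.05, alternative="greater")
c_low, reject_low = one_tailed_decision(22, 6, 30, alpha=0.05, alternative="less")
print(round(c_up, 4), reject_up, reject_low)
```

When the observed direction disagrees with the alternative, the sketch returns "do not reject" without consulting the normal table, mirroring the rule stated above.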
EXAMPLE


An Air Force officer developed a new incendiary material for a
standard round and claimed that his was better than those presently in
stock. An independent Air Force test group was given the task
of conducting a comparison test. Due to a time limitation, it was
decided to test the new incendiary against one which was readily
available. The test plan called for shooting both incendiary rounds
against fuel cells instrumented to give inside temperature readings
in degrees centigrade. The fuel cells contained a common fuel, and
the decision as to a fire or no fire was determined by the project
officer. Questionable situations were resolved by using a time
history plot of the temperature. Ties in the data occurring while
ranking the observations were eliminated by using the time of day
a shot occurred. Table 21 presents the data ordered by temperature.
The ignitors are represented by an H for the standard and an M for the
new material. The result of each shot was a fire (F) or a no fire
(N). Columns 5 through 8, respectively, represent the number of
times material M started a fire at a low temperature and material
H started no fire at higher temperatures; the number of times material
H started a fire at a low temperature and material M started no fire
at higher temperatures; the number of times material M started a fire
at a low temperature and material M started no fire at higher
temperatures; and the number of times material H started a fire at a low
temperature and material H started no fire at higher temperatures.

The column total for column 5 is $I_{10}$, and the column total for column 6 is $I_{01}$.

In testing the one-sided hypothesis with $H'_{11}$: H is a better
incendiary than M, or temperature has an effect upon the results,
or both, one would use Ury's C. First check to see if the column 6 total >
the column 5 total. It is; therefore, C is calculated and found to be

$$C = \left[\frac{12 (11 - 35)^2}{(29 + 4)(11 + 35)}\right]^{1/2} = 2.1338 .$$

Comparing this to the standard normal distribution, the observed
significance level, $\alpha$, is found to be 0.017. This was determined to
be both statistically and practically significant, so $H_{00}$ was
rejected at the 98.3 percent confidence level.
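
The arithmetic of the example can be retraced with a short Python sketch. The rows below are a transcription of columns 1 through 3 of Table 21 and should be checked against the table; the function name `column_totals` and the single backward pass used to accumulate the counts are choices made for this illustration, not the report's procedure.

```python
import math
from statistics import NormalDist

# (temperature, ignitor, result), ordered as in Table 21 with ties already broken.
SHOTS = [
    (20.2, "M", "N"), (21.4, "H", "F"), (22.0, "M", "N"), (22.0, "H", "F"),
    (23.0, "H", "N"), (25.0, "H", "F"), (26.0, "M", "N"), (26.0, "M", "F"),
    (26.0, "M", "F"), (26.8, "M", "N"), (27.2, "H", "F"), (28.0, "M", "F"),
    (28.8, "M", "N"), (29.0, "M", "N"), (30.4, "H", "F"), (32.0, "M", "F"),
    (32.6, "M", "N"), (33.0, "M", "N"), (33.5, "M", "N"), (34.0, "M", "F"),
    (34.0, "H", "N"), (35.0, "M", "N"), (35.0, "H", "F"), (37.0, "M", "F"),
    (37.0, "H", "N"), (37.0, "H", "F"), (37.0, "H", "F"), (40.0, "M", "F"),
    (47.0, "M", "F"),
]


def column_totals(shots):
    """Totals keyed by (ignitor with fire at lower rank, ignitor with no fire above)."""
    no_fire_above = {"H": 0, "M": 0}
    totals = {("M", "H"): 0, ("H", "M"): 0, ("M", "M"): 0, ("H", "H"): 0}
    # Walk from the highest temperature down, tracking no-fires already passed.
    for _, ignitor, result in reversed(shots):
        if result == "F":
            for other in ("H", "M"):
                totals[(ignitor, other)] += no_fire_above[other]
        else:
            no_fire_above[ignitor] += 1
    return totals


totals = column_totals(SHOTS)
i10 = totals[("M", "H")]  # column 5 total
i01 = totals[("H", "M")]  # column 6 total
n = len(SHOTS)
c = math.sqrt(12 * (i10 - i01) ** 2 / ((n + 4) * (i10 + i01)))
tail = 1 - NormalDist().cdf(c)  # one-sided normal tail area beyond C
print(i10, i01, n, round(c, 4), round(tail, 3))
```

The column totals and C agree with the values used in the example, and the tail area is close to the reported significance level of 0.017.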


APPENDIX  A 


IDENTITY:  DEVIATION  OF  OBSERVATIONS  FROM  THE  MEAN 


The  identity  for  deviation  of  observations  from  the  mean  and 
the  identity  cross  product  deviation  from  the  means  will  be  developed 
in  this  appendix.  They  are  used  in  the  development  of  the  test 
statistic  U2. 

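Deviation of observations from the mean:

$$(y_{ij} - \bar{y}_{\cdot\cdot}) = (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}) + (y_{ij} - \bar{y}_{i\cdot})$$

Squaring both sides and summing over i and j, one obtains

$$\sum_{ij} (y_{ij} - \bar{y}_{\cdot\cdot})^2 = \sum_{ij} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2 + \sum_{ij} (y_{ij} - \bar{y}_{i\cdot})^2 + 2 \sum_{ij} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})(y_{ij} - \bar{y}_{i\cdot}).$$

Working only with the cross product term, we have

$$2 \sum_{ij} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})(y_{ij} - \bar{y}_{i\cdot}) = 2 \sum_{i} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}) \Big[\sum_{j} (y_{ij} - \bar{y}_{i\cdot})\Big] = 0 .$$

Therefore, the cross product term sums to zero and the identity may
be expressed as

$$\sum_{ij} (y_{ij} - \bar{y}_{\cdot\cdot})^2 = \sum_{ij} \big[(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2 + (y_{ij} - \bar{y}_{i\cdot})^2\big] = T_{yy} + E_{yy} .$$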


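Cross product deviation of observations from the mean:

$$\begin{aligned}
(z_{ij} - \bar{z}_{\cdot\cdot})(y_{ij} - \bar{y}_{\cdot\cdot}) &= [(z_{ij} - \bar{z}_{i\cdot}) + (\bar{z}_{i\cdot} - \bar{z}_{\cdot\cdot})]\,[(y_{ij} - \bar{y}_{i\cdot}) + (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})] \\
&= (z_{ij} - \bar{z}_{i\cdot})(y_{ij} - \bar{y}_{i\cdot}) + (z_{ij} - \bar{z}_{i\cdot})(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}) \\
&\quad + (\bar{z}_{i\cdot} - \bar{z}_{\cdot\cdot})(y_{ij} - \bar{y}_{i\cdot}) + (\bar{z}_{i\cdot} - \bar{z}_{\cdot\cdot})(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})
\end{aligned}$$

First sum over j, then over i:

$$\begin{aligned}
\sum_{ij} (z_{ij} - \bar{z}_{\cdot\cdot})(y_{ij} - \bar{y}_{\cdot\cdot}) &= \sum_{i} \Big[ \sum_{j} (z_{ij} - \bar{z}_{i\cdot})(y_{ij} - \bar{y}_{i\cdot}) + (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}) \sum_{j} (z_{ij} - \bar{z}_{i\cdot}) \\
&\qquad + (\bar{z}_{i\cdot} - \bar{z}_{\cdot\cdot}) \sum_{j} (y_{ij} - \bar{y}_{i\cdot}) + \sum_{j} (\bar{z}_{i\cdot} - \bar{z}_{\cdot\cdot})(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}) \Big] \\
&= \sum_{ij} (\bar{z}_{i\cdot} - \bar{z}_{\cdot\cdot})(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}) + \sum_{ij} (z_{ij} - \bar{z}_{i\cdot})(y_{ij} - \bar{y}_{i\cdot})
\end{aligned}$$

$$\sum_{ij} (z_{ij} - \bar{z}_{\cdot\cdot})(y_{ij} - \bar{y}_{\cdot\cdot}) = T_{zy} + E_{zy}$$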
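
Both identities of this appendix can be checked numerically. The Python sketch below does so for a small one-way layout; the data values, the unequal class sizes, and the function name `partition` are hypothetical and serve only as an illustration.

```python
def mean(values):
    return sum(values) / len(values)


def partition(groups_z, groups_y):
    """Return (total, treatment, error) sums of cross products for a one-way layout.

    Passing the same groups for z and y gives the sums of squares instead."""
    all_z = [v for g in groups_z for v in g]
    all_y = [v for g in groups_y for v in g]
    z_bar, y_bar = mean(all_z), mean(all_y)
    total = sum((z - z_bar) * (y - y_bar) for z, y in zip(all_z, all_y))
    # Summing the class-mean term over j contributes the class size as a factor.
    treatment = sum(len(gz) * (mean(gz) - z_bar) * (mean(gy) - y_bar)
                    for gz, gy in zip(groups_z, groups_y))
    error = sum((z - mean(gz)) * (y - mean(gy))
                for gz, gy in zip(groups_z, groups_y)
                for z, y in zip(gz, gy))
    return total, treatment, error


# Hypothetical observations: two classes with a covariate z and a response y.
z_groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 3.0, 5.0]]
y_groups = [[4.0, 6.0, 5.0], [7.0, 9.0, 8.0, 10.0]]

total_yy, t_yy, e_yy = partition(y_groups, y_groups)  # sums of squares
total_zy, t_zy, e_zy = partition(z_groups, y_groups)  # sums of cross products
print(total_yy, t_yy + e_yy)
print(total_zy, t_zy + e_zy)
```

In each case the total agrees with the treatment term plus the error term up to floating-point rounding.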
APPENDIX B


VARIANCE  OF  THE  ADJUSTED  TREATMENT  MEAN 


The  variance  of  the  estimated  adjusted  treatment  mean  is  developed 
in  this  appendix.  It  is  used  in  the  discussion  in  Section  V,  "Decision 
Rule.” 


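$$\begin{aligned}
V(\hat{\mu}_{i\cdot}) &= V[\bar{y}_{i\cdot} - \hat{\beta}(\bar{z}_{i\cdot} - \bar{z}_{\cdot\cdot})] \\
&= V(\bar{y}_{i\cdot}) + (\bar{z}_{i\cdot} - \bar{z}_{\cdot\cdot})^2 V(\hat{\beta}) - 2\,\mathrm{cov}[\bar{y}_{i\cdot}, \hat{\beta}(\bar{z}_{i\cdot} - \bar{z}_{\cdot\cdot})]
\end{aligned}$$

Consider the above equation term by term:

$$\mathrm{cov}[\bar{y}_{i\cdot}, \hat{\beta}(\bar{z}_{i\cdot} - \bar{z}_{\cdot\cdot})] = (\bar{z}_{i\cdot} - \bar{z}_{\cdot\cdot})\,\mathrm{cov}(\bar{y}_{i\cdot}, \hat{\beta})$$

$$\hat{\beta} = \sum_{ij} (z_{ij} - \bar{z}_{i\cdot})(y_{ij} - \bar{y}_{i\cdot}) \Big/ \sum_{ij} (z_{ij} - \bar{z}_{i\cdot})^2$$

$$K = (\bar{z}_{i\cdot} - \bar{z}_{\cdot\cdot}) \sum_{ij} (z_{ij} - \bar{z}_{i\cdot}) \Big/ \sum_{ij} (z_{ij} - \bar{z}_{i\cdot})^2 .$$

So we have

$$\begin{aligned}
\mathrm{cov}[\bar{y}_{i\cdot}, \hat{\beta}(\bar{z}_{i\cdot} - \bar{z}_{\cdot\cdot})] &= K\,\mathrm{cov}[\bar{y}_{i\cdot}, (y_{ij} - \bar{y}_{i\cdot})] \\
&= K\,[\mathrm{cov}(\bar{y}_{i\cdot}, y_{ij}) - \mathrm{cov}(\bar{y}_{i\cdot}, \bar{y}_{i\cdot})] \\
&= K\,\Big[\mathrm{cov}\Big(\frac{1}{n}\sum_{j} y_{ij},\, y_{ij}\Big) - V(\bar{y}_{i\cdot})\Big] \\
&= K\,\Big[\frac{1}{n}\sigma^2_{y \cdot z} - \frac{1}{n}\sigma^2_{y \cdot z}\Big] \\
&= 0 .
\end{aligned}$$

Now consider the term:

$$\begin{aligned}
V(\hat{\beta}) &= V\Big[\sum_{ij} (z_{ij} - \bar{z}_{i\cdot})(y_{ij} - \bar{y}_{i\cdot}) \Big/ \sum_{ij} (z_{ij} - \bar{z}_{i\cdot})^2\Big] \\
&= \Big[1 \Big/ \sum_{ij} (z_{ij} - \bar{z}_{i\cdot})^2\Big]^2 \Big[V\Big\{\sum_{ij} (z_{ij} - \bar{z}_{i\cdot})\, y_{ij}\Big\} - V\Big\{\sum_{ij} (z_{ij} - \bar{z}_{i\cdot})\, \bar{y}_{i\cdot}\Big\}\Big] .
\end{aligned}$$

$\bar{y}_{i\cdot}$ is constant with respect to j, and $\sum_{j} (z_{ij} - \bar{z}_{i\cdot}) = 0$;
therefore

$$V\Big[\sum_{ij} (z_{ij} - \bar{z}_{i\cdot})\, \bar{y}_{i\cdot}\Big] = 0 .$$

This then leaves

$$V(\hat{\beta}) = \Big[1 \Big/ \sum_{ij} (z_{ij} - \bar{z}_{i\cdot})^2\Big]^2 \Big[\sum_{ij} (z_{ij} - \bar{z}_{i\cdot})^2\, \sigma^2_{y \cdot z}\Big] .$$

The term $\sum_{ij} (z_{ij} - \bar{z}_{i\cdot})^2$ in the numerator will divide out with one of
the terms in the denominator if they are corrected to the proper ith
treatment. Continuing, we have

$$V(\hat{\beta}) = \sigma^2_{y \cdot z} \Big/ \sum_{ij} (z_{ij} - \bar{z}_{i\cdot})^2 = \sigma^2_{y \cdot z} / E_{zz} .$$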


APPENDIX  C 


IDENTITY:  SUM  OF  SQUARES  OF  ALL  DIFFERENCES 

The  identity,  the  sum  of  squares  of  all  differences  which  is 
identical  to  2n  times  the  sum  of  squares  about  the  mean,  will  be 
developed  in  this  appendix. 


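$$\begin{aligned}
\sum_{i \ne k} (z_i - z_k)^2 &= \sum_{i} \sum_{k} (z_i - z_k)^2 \\
&= \sum_{i} \sum_{k} (z_i^2 - 2 z_i z_k + z_k^2) \\
&= \sum_{i} \sum_{k} z_i^2 - 2 \sum_{i} z_i \sum_{k} z_k + \sum_{i} \sum_{k} z_k^2 \\
&= \sum_{i} n z_i^2 - 2\, n\bar{z}\; n\bar{z} + \sum_{k} n z_k^2 \\
&= 2n \Big(\sum_{i} z_i^2 - n \bar{z}^2\Big) \\
&= 2n \sum_{i} (z_i - \bar{z})^2
\end{aligned}$$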
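
A quick numerical check of this identity, written as a Python sketch with four hypothetical values of z:

```python
# Hypothetical values; any list of numbers would serve.
z = [1.0, 4.0, 6.0, 9.0]
n = len(z)
z_bar = sum(z) / n

# Left side: squared differences over all ordered pairs (i, k); terms with i = k are zero.
all_differences = sum((zi - zk) ** 2 for zi in z for zk in z)

# Right side: 2n times the sum of squares about the mean.
about_mean = 2 * n * sum((zi - z_bar) ** 2 for zi in z)

print(all_differences, about_mean)
```

Here the mean is 5, the sum of squares about the mean is 34, and both sides come to 272.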




REFERENCES 


1.  Bancroft, T.A., Topics in Intermediate Statistical Methods, Iowa State
University Press, 1968, pp. [page numbers illegible].

2.  Bross, Irwin D.J., "Taking a Covariable into Account," Journal of
the American Statistical Association, September 1964, pp. 725-736.

3.  Cochran, W.G., "Analysis of Covariance: Its Nature and Uses,"
Biometrics, September 1957, pp. 261-280.

4.  Coons, I., "The Analysis of Covariance as a Missing Plot Technique,"
Biometrics, September 1957.

5.  Draper, N.R., and Smith, H., Applied Regression Analysis, Wiley and
Sons, Inc., 1966, pp. 1-43.

6.  Hazel, L.N., "The Covariance Analysis of Multiple Classification
Tables With Unequal Subclass Numbers," Biometrics, Volume 2, No. 2,
April 1946, pp. 21-25.

7.  Katti, S.K., "Multiple Covariance Analysis," Biometrics, Volume 21,
No. 4, December 1965, pp. 957-974.

8.  Morrison, D.F., Multivariate Statistical Methods, McGraw-Hill,
1967, pp. 180-182.

9.  Puri, Madan L. and Sen, Pranab K., "Analysis of Covariance Based on
General Rank Scores," The Annals of Mathematical Statistics, 1969,
Volume 40, No. 2, pp. 610-618.

10.  Quade, Dana, "Rank Analysis of Covariance," Journal of the American
Statistical Association, Volume 62, September - December 1967, pp. 1187-1200.

11.  Searle, S.R., Linear Models, Wiley and Sons, Inc., 1971, pp. 340-361.

12.  Snedecor, G.W. and Cochran, W.G., Statistical Methods, 6th Ed.,
1967, pp. 419-446.

13.  Steel, R.G.D. and Torrie, J.H., Principles and Procedures of
Statistics, McGraw-Hill, 1960, pp. 305-330.

14.  Ury, Hans K., "A Note on Taking a Covariable Into Account,"
Journal of the American Statistical Association, Volume 61, March -
June 1966, pp. 490-495.




INITIAL  DISTRIBUTION 




HQ  USAF/SAMI  1 
USAFE/DOQ  1 
FACAF/DOOFQ  1 
TAC/DRA  1 
ASD/ENFEA  1 
AUL/LSE-  71-249  1 
SAC/NRI  (STINFO  LIB)  1 
NWC/CODE  318  1 
NWC/CODE  317  1 
OO-ALC/MMWMP  2 
AFIS/INTA  1 
DDC  2 
AFATL/DLODL  2 
AFATL/DL  1 
AFATL/DLY  1 
ADTC/XRS  1 
AFATL/DLYV  20 
AFATL/DLYW  10 
USA  ENG  WatWay  Ex  Sta/VMS  1 
BAL  RESRCH  LAB/AMXBR-VL  1 
AMSAA/DRXSY-J  2 
USAMSAA/ DRXSY - S 1 
ARRADCCM/DRDAR-  LCU-  TM  1 
AFOSR/NM  1 
ARMY  RESEARCH  OFFICE/NC  1 
OKLAHOMA  ST  UNIV/Dept  of  Stat  20 
TAC/INAT  1 
USA  TRADOC  SYS  ANN  ACT  1 
ASD/XRP  1 
COMIPAC/I-232  1 




